When interviewing first came into use
as a technique of empirical social re- 
search, scientists hoped that it might 
provide answers to many perplexing 
problems. But, after a series of contra- 
dictions of results obtained through 
mass interviews, severe doubts were 
cast upon its validity—it was believed 
that the bias of the interviewer himself 
affected results so significantly as to 
render meaningful evaluation impossible.
Interviewing in Social Research
re-examines the interview in the light 
of this criticism in an effort to determine 
the real effect of the relationship be- 
tween the interviewer and the person 
interviewed. 


Surprisingly, the conclusion reached by 
the authors is that the effect of the bias 
introduced by the interviewer is com- 
monly so small as to be practically neg- 
ligible—a fact which in itself is sufficient 
to make this book a welcome and important
addition to the literature of
social science.


However, while this study is primarily 
concerned with the factor of error in
interviewing, it has been conceived in 
the larger framework of the interview 
as a generic method of the social sci- 
ences. Reports are included on new 
experimental and field studies, and a 
Foreword


What people say as well as what people do are woven into the fabric
of history.

In the last three decades there has been a technological revolution
in the recording of the spoken opinions of representative samples of
entire populations. Historians, so many of whom are humanists with an
aversion to tools like statistical devices for sampling or like IBM
punched cards for analysis, have been slow, indeed, to recognize the
significance for their profession of this technological revolution.

But not so other social scientists, especially in social psychology and 
sociology. And not so industrial organizations, which have found that 
systematic sampling of opinion of management and workers, of customers,
and of the general public can provide facts indispensable for
policy-making.

This technological revolution has depended on many inventions. One
class of inventions involves the application of mathematical principles
to the selection of a sample which can reproduce the responses of a
population, with a small calculable error. Another class of inventions
involves the development of scales of measurement. Still another involves
a host of new techniques of analysis, facilitated by the miracles
performed by new electronic computing machines. A large and cumulative
literature of new ideas and criticism is helping to make each of
these processes more effective.

More study is needed. But there is one link in this effort, in particular,
which thus far has not received as much constructively critical examination
as its importance deserves. This link is the human middleman in the
normal process of eliciting opinions: the interviewer. In spite of the
obvious possibility that bias, conscious or unconscious, of the interviewer
might cause serious bias in responses, there has been surprisingly
little systematic study of the interviewer and the interviewing process
itself. To help fill this gap, the present volume provides a much-needed
fund of information.

The major studies which led to this book had their genesis in a combined
interest, shortly after World War II, expressed by the National
Research Council, on behalf of the natural sciences, and the Social
Science Research Council, on behalf of the social sciences. A joint committee
of the two councils was established, called the Committee on the
Measurement of Opinion, Attitudes, and Consumer Wants. The committee
was a diverse body, comprising mathematicians, social scientists,
leading practitioners of public opinion research, and representatives of 
important consumers of applied research—in advertising agencies, in- 
dustrial establishments, and such associations as the American Standards 
Association and the American Society for Testing Materials. Among
the first problems which this committee examined was the need for a
systematic study of the interviewer and the interviewing process. The
authors of the present volume were urged to undertake this study, with 
financial assistance provided through the vision of the Rockefeller 
Foundation. 

The committee takes no credit for and assumes no responsibility for 
the conclusions of the authors, who were left completely free to publish 
any findings they chose, without prior review or criticism. As chairman 
of the committee, however, the writer of this Foreword wishes per- 
sonally to commend this volume to a wide reading public. 

This volume is important both in its substantive findings and in the 
ways it reaches them. The authors are not “mere technicians.” Sophisticated
in psychological and sociological theory, they have molded
theory into operational propositions and have put these propositions
to the test with actual field investigations and experiments.

What they say is not intended to be the last word on the subject. 
Standing on the shoulders of work previously done, they give us a 
clearer and wider vision than we have ever had before of the human 
element in the interviewing process. It is to be hoped that future in- 
vestigators, standing in turn on the shoulders of the present authors, 
can, in the years ahead, extend the vista further. 

For the work of ascertaining the thoughts and wishes of people, their 
hopes and frustrations, their attitudes and values, is becoming ever more 
important to the complex world in which we live. 


SAMUEL A. STOUFFER
HARVARD UNIVERSITY


Preface 


In 1947 the National Opinion Research Center undertook to study 
systematically the sources of error in research that depends upon in- 
terviewing as a method of data collection.* The purposes of the study 
were (1) to determine and evaluate empirically the factors that may
operate within the interview to produce error in the data derived
from it and (2) to test the amenability of these factors to methods of
control designed to minimize their effects.

In the effort to attain these objectives, the first step was to collect or 
to construct a complement of hypotheses concerning the nature and 
mode of operation, under varying circumstances, of error-producing
factors. This involved not only a thorough critical search of the speculative
and research literature but also an assessment of materials in the
files of research agencies and consultation with experienced research
persons to discover any hunches that had arisen out of their experience.
It involved, further, careful scrutiny and analysis of the interview
situation through empirical observation of interviewing under both
natural and experimentally contrived conditions and through clinical
interviews with experienced interviewers and their respondents.

The hunches and hypotheses resulting from this first phase of the
study guided all subsequent phases. A careful search was made of the
research literature to discover studies which bore directly or indirectly
on any of the hypotheses. The hypotheses were further tested, as circumstances
permitted, by setting up quasi-experimental projects in connection
with studies made primarily for other purposes, either by
NORC or by some other research agency or person. Finally, hypotheses
pertaining to error-producing factors that seemed to operate quite
[...] or with weighty effect were studied more thoroughly and
definitively through specially designed experimental studies.

* These studies were commissioned by the Joint Committee of the Social Science
Research Council and the National Research Council on the Measurement of
Opinion, Attitudes, and Consumer Wants. They were originally projected to
occupy a period of two years; as it turned out, they extended over nearly six years.

The present volume developed out of this program of research and
is basically a report of findings. But it is more than that; it has turned
out to be a treatise on interviewing as a method of inquiry in the social
sciences, with special attention to sources of error and their control. It
begins with a documented exposition of the universal dependence of
social scientists upon interviewing and an appraisal of the self-con- 
sciousness or sophistication of specialists in the various areas of social 
scientific study with respect to conditions affecting reliability of the 
method (Chap. I). It then examines, on the basis of previous research, 
supplemented by qualitative data developed in the course of our own 
studies, the nature of the interview situation and, against this descriptive 
background, develops plausible hypotheses about the factors that tend 
to produce error (Chap. II). The volume then examines successively 
the operation of these factors in the interviewer (Chap. III), in the 
respondent (Chap. IV), and in dynamic and variable relationships be- 
tween interviewer and respondent under the impact of situational fac- 
tors that are largely external to both of them (Chap. V). Finally, atten- 
tion is given to the nature and significance of the effects produced by 
these factors under normal operating conditions (Chap. VI) and to 
various methods of measuring and reducing these effects (Chap. VII).

Throughout this work the contributions of Mr. Hyman have been
pre-eminent. He directed the research and largely planned and wrote 
this report. But, under his guidance, the entire project has been a genu- 
inely collaborative one. Besides serving as consultant on virtually all 
phases of the research and handling the statistical analysis and interpre- 
tation of results in several specific studies, Mr. Feldman prepared the 
original draft of Chapter VI. Mr. Stember and Mr. Cobb were also 
perennial consultants and collaborators on phases of the research, and 
they jointly prepared the original draft of Chapter IV. In addition, Mr. 
Stember prepared the original draft of Chapter V, and Mr. Cobb the 
original draft of Chapter VII. 

But many persons besides those listed as author and associate authors
made substantial collaborative contributions. Don Cahalan, Gordon
Connelly, and Miss Anne Schuetz rendered invaluable assistance in developing
the original research plans. Mr. Cahalan also, with the assistance
of Hugh Parry, Helen Crossley, and other members of his staff at the
University of Denver, had a large hand in planning and carrying out
the Denver study of validity and interviewer variance. Paul Sheatsley's
specific research contributions are only partially indicated in the text;
over and above these, he advised continuously on research planning and
on interpretation of research findings and read critically every research
report as well as most of the text that follows. Miss Shirley Star and Eli
Marks also gave cogent advice and criticism in connection with many
of the problems that arose during the course of this work.

Mrs. Ruth Blumenstock Cooperstock deserves special mention here 
because of her indefatigable work in digesting the research literature on
interviewer effect. A special note of appreciation is due also to Mrs. 
Michael McGarry, Mrs. Nella Siefert, and Mrs. Ada Caplow for their 
capable assistance in the preparation of the manuscript and in the con- 
struction of the Index. 

Representatives of a large number of research agencies—academic, 
governmental, and commercial—not only contributed helpful ideas but 
also made available collections of data from their files, reshaped their 
own studies at times to make them serve better some need incident to 
our research, and occasionally participated jointly with NORC in de- 
signing and executing some quasi-experimental study. Especially help- 
ful in this connection and in the critical reading of special research re- 
ports, as well as portions of this volume, were Daniel Katz, Herman 
Witkin, and Lester Guest. We are particularly indebted to Frederick F. 
Stephan, W. Edwards Deming, Samuel A. Stouffer, and Leland C. De- 
Vinney for their constructive advice and assistance in connection with 
major aspects of the work. In the early phases of the research, also, we
received many helpful suggestions from members of the Committee on 
the Measurement of Opinion, Attitudes, and Consumer Wants, who 
gave us on numerous occasions their constructive advice and sympa- 
thetic support. 

Finally, we wish to acknowledge our indebtedness to the Rockefeller
Foundation for its generous support of our research.

Certain precautionary statements made at appropriate points in the
following pages are generally applicable to this work and should be
constantly borne in mind by the reader. Research inevitably reflects
the reality conditions prevailing at the time the research is done. The
concepts and methods employed in any study cannot, in most instances,
greatly transcend the current state of scientific knowledge and the
limited research facilities, including trained personnel, currently available.
The research that has been done on sources of error in the interview
(our own research as well as that of others) is obviously subject
to such limitations. In nearly all studies the subjects used were interviewers
who were available in research agencies and in colleges and
universities where research training is undertaken. Generalization of
our conclusions to researchers of greater maturity and sophistication
than these subjects has to be made, therefore, with due and proper caution.
It would be dangerous, however, though consoling, for the mature
and sophisticated interviewer to assume that he is not equally subject
to the operation of the same error-producing factors affecting the
varied group of interviewers covered by the studies we are here reporting.
As a matter of fact, the available evidence suggests that, while the
sophisticated interviewer may be less subject to variable errors of a 
careless sort, he is probably equally subject to certain serious biasing 
errors. Moreover, it seems likely that much research will continue to 
depend upon the co-operation of relatively large numbers of interview- 
ers of substantially the same caliber as those used in previous studies 
of interviewer effect. 

Of course it is not to be assumed that the way to reduce or eliminate 
error is simply to adapt one’s research objectives and methods to what 
one assumes to be the present level of competence of available inter- 
viewers. That kind of accommodation could only freeze performance 
at its present level and continuously impoverish research. As we attempt 
to make clear in this report, it seems to us that there are other remedies 
to be employed, even within the limits imposed by prevailing reality
conditions. Many of these remedies pertain more to the researcher
himself than to the interviewers upon whom he must depend. Even so,
there should be continued and enlightened effort, particularly in large-scale
research undertakings, which involve dependence on a corps of
field interviewers, to lift the level of interviewer competence.

It should be clearly borne in mind, too, that reduction or elimination
of interviewer effect is only one of many considerations which the
designer of a survey must take into account in defining his objectives
and setting up his procedures. Obviously, one would not wish to impose
restraints upon interviewers which would so impair their effectiveness
as to make the interviews relatively sterile. One certainly would
not forego using a type of question which, though it increased the likelihood
of bias, provided the only available means of gauging, even
roughly, the dimensions of a certain variable. In this area, as in sampling
and all other areas, a doctrinaire attitude is to be avoided. The important
considerations are, first, that the researcher make every effort consistent
with his larger purposes to secure results that are valid and
reliable and, second, that he know what risk of bias he is taking and recognize
willingly and clearly the limitations it imposes on his endeavors.
There is reason to believe that [...] are amenable to improvement in
this [...]ing efficiency in other respects.


CLYDE W. HART
CHAPTER I


A Frame of Reference for the Study 
of Interviewer Effect 


1. THE SETTING OF THE PROBLEM 


Interviewing as a method of inquiry is universal in the social sciences. 
The literature of anthropology is a product of the interviewing of in- 
formants. Sociologists have made wide use of the method. The writings 
of psychiatrists, clinicians, and psychoanalysts about man and society 
had their beginnings in an interviewing situation—diagnostic and thera- 
peutic interviews with patients. The periodic censuses of the United 
States and other countries are monuments to the interview method, and 
the thousands of students making use of these historical archives, 
whether conscious of it or not, cannot ignore their ultimate dependence 
on interview data. New applied fields cutting across the classic disci- 
plines—human relations, industrial relations, communications research, 
area studies—all make use of interview data. Public opinion research, 
as a common resource of the political scientist, public administrator, 
social psychologist, and historian is built upon the foundations of inter- 
viewing. 

It is clear therefore that fundamental inquiry into the problem of 
interviewing may have wide ramifications and general value far be- 
yond the specific context of survey research within which this study 
was initiated. Yet the very universality of interviewing as a method and 
the infinite variety of the procedures subsumed under the term create 
a difficulty. No single investigation—not even a score of investigations 
—could bear directly upon all the concrete forms and manifestations 
which interviewing takes. Inevitably, some of the principles to be de- 
veloped, some of the quantitative findings that will be generated, and 
particular procedures to be recommended after examining the weight 
of our evidence may not be applicable to the interviewing problems 
of readers in particular fields. Note how contrary to our rules and ex- 
perience in modern survey research the following prescription for 
proper social research interviewing is:*


The interviewer must have a very good memory. The information has to 
be obtained in the course of general conversation. . . . Usually the inter- 
viewer has to remember all the answers he has obtained and write them out 
after he has returned to his own place. . . . Usually he has to talk a good
deal about general topics, partly to show that he understands the conditions 
in the region and partly that he is interested in acquiring new knowledge. It 
will not do for him to make it plain that his interest is to obtain statistical 
information. . . . It will not do for the interviewer to ask one question after 
another even when the respondent has shown a willingness to talk. . . .
Sometimes several questions worded differently have to be asked in order to 
obtain one answer, if the first or first few answers are not satisfactory. In 
such cases these questions . . . must not follow one after another, but other 
questions or general discussion should intervene in order to take the re- 
spondent off guard, or to make him understand exactly what information 
is wanted . . . In some cases some sort of pressure has to be exercised on 
the respondent. The pressure must not be so great as to make the respond- 
ent feel he is under compulsion to supply information, nor should it be so 
slight that he may disregard it entirely. 


Yet who is to say that there are not particular conditions under which 
this prescription is appropriate?

The foregoing quotation is from a description by the Chinese repre- 
sentative on the U.N. Statistical Commission of the interviewer's task 
in collecting information, developed out of the difficulties of initiating 
statistical inquiries among the Chinese people. Lieu even commends
to the interviewer such bizarre behavior, arising out of the requirements
of his research situation, as the following: “In the production of polished
rice, he must know the quantity that can be obtained from a
picul of paddy,” and “the interviewer must choose his respondents, 
which sometimes makes random sampling very difficult.” 

Inevitably, any empirical research on interviewing method can only 
sample a fragment of so vast an area; yet we seek findings of some gen- 
erality. Even if we were to limit the area to that of public opinion in- 
terviewing within America, we would still encompass such a diversity 
of procedures, topics, problems, respondents, and interviewers that a
single methodological inquiry would seem to be gravely inadequate. 
There is one solution that is available. It is that while we operate within 
a narrow realm in the concrete sense we shall focus on fundamental 
processes within the interview that transcend our specific research setting.
That is why a survey specialist seeking specific and elaborate prescriptions
and remedies will not find them in this report. They might be
inappropriate to his own current interviewing problems; they would
certainly be obsolete by 1970; and they would have little relevance to
the larger social science audience. As Roethlisberger and Dickson state 
in their discussion of interviewing method:* 


It is evident that the interviewing of a child, a psychoneurotic, a native of 
a primitive community, or the normal adult of a civilized community involves
different modifications in the way the interview takes place. . . .
There is always the danger for the beginner that he attach a significance to
the rules of performance that they do not have. He tends to treat them as
absolute prescriptions which should never be violated and he tends to multiply
them without end. . . . rules for conducting the interview are substituted
for understanding.


In order for us to increase our fundamental understanding, we must
inquire, for example, into the social and psychological meaning of an
interview for the two parties involved. We shall explore some of the
cognitive and motivational processes operating within the interviewer.
We shall ask how his behavior is molded by these processes but in turn
modified by the nature of his task. We shall examine some of the reactions
of the respondent when he is confronted by an interviewer.
Then, we shall elaborate on the relation of errors in the data to ongoing
processes within the humans who operate in interviewing situations of
various types. By the elaboration of data and theory about such more
general and abstract features of any interview, we shall hope to achieve
some degree of generality.

The concrete materials on which this study is based will, of course, 
have immediate relevance to the activities of current survey agencies, 
and data on the magnitude and control of error will be presented in 
detail. Implicit in that presentation is the limitation that the quantitative
findings relate only to the current operations of some public opinion
agencies. But it is our hope that no such limitation will affect the larger
and more theoretical features of this report. 

In presenting any detailed research report on one phenomenon, one
[...] excludes from discussion many other phenomena which may
[...] the problem. Thus, in [...] effect, we may run the danger of
narrowing [...] much. In order that the reader should have what we
would regard as the appropriate perspective for interpreting our
ultimate findings, we shall first discuss some broader matters.


2. THE EVALUATION OF ERROR—QUANTITATIVE EVIDENCE 


The present report is in the nature of a dangerous confession. Research
workers using the survey method are willingly exposing themselves
to criticism by reporting on a most comprehensive study and
demonstration of errors in their findings. This is dangerous, for the
general reaction may be to damn the method summarily because of its
[...]. It is therefore of the utmost importance to evaluate the study
and demonstration of error in a proper manner.

Let it be noted that the demonstration of error marks an advanced
stage of a science. All scientific inquiry is subject to error, and it is far 
better to be aware of this, to study the sources in an attempt to reduce 
it, and to estimate the magnitude of such errors in our findings, than 
to be ignorant of the errors concealed in the data. One must not equate 
ignorance of error with the lack of error. The lack of demonstration 
of error in certain fields of inquiry often derives from the nonexistence 
of methodological research into the problem and merely denotes a
less advanced stage of that profession. 

We are here studying those errors which occur in survey research 
as a result of the method of personal interviewing. We shall find many 
instances of error, which might make the reader regard the interview 
procedure developed in the survey field as inferior to the interview pro- 
cedures used in other types of scientific research. Yet in some of these 
other fields, the errors committed by interviewers may conceivably far 
exceed those we will demonstrate. 

Social anthropology rests in great measure upon information col- 
lected through the interviewing of informants. That such interviewing 
is not free from unreliability is clear from occasional discrepancies be- 
tween the published reports of different ethnologists who have hap- 
pened to study the same society. 

For example, Murdock’s observations of the Tenino of central Ore- 
gon differed from earlier reports by other anthropologists.* Different 
anthropologists have offered sharply discrepant accounts of Pueblo 
culture despite obvious lack of independence in the observations.* Other 
more elaborate instances present themselves. The village of Tepoztlan as 
described by Lewis is quite different from the same village as it was de- 
scribed earlier by Robert Redfield. In summarizing the differences be- 
tween the two studies, Lewis remarks: “The impression given by Red- 
field’s study of Tepoztlan is that of a relatively homogeneous, isolated, 
smoothly functioning, and well-integrated society made up of a contented
and well-adjusted people. His picture of the village has a Rousseauian
quality which glosses lightly over evidence of violence, disruption,
cruelty, disease, suffering and maladjustment. We are told
little of poverty, economic problems, or political schisms. Throughout
his study we find an emphasis upon the cooperative and unifying factors
in Tepoztecan society. Our findings, on the other hand, would
emphasize the underlying individualism of Tepoztecan institutions and
character, the lack of cooperation, the tensions between villages within
the municipio; the schisms within the village and the pervading quality
of fear, envy, and distrust in inter-personal relations.”* Despite their
common experience with the same society, Fortune contradicts Margaret
Mead's account of the Arapesh:


A theory has been advanced that this social culture “works, selecting one 
temperament, or a combination of related and congruent types, as desirable, 
and embodying this choice in every thread of the social fabric." According 
to this theory the entire Arapesh social culture has selected a maternal 
temperament, placid and domestic in its implications, both for men and 
women. The theory has been applied to the cultural analysis of Arapesh
warfare, and has led to conclusions that "warfare is practically unknown
among the Arapesh—the feeling towards a murderer and that towards a 
man who kills in battle are not essentially different—abductions of women 
are not unfriendly acts on the part of the next community.” These conclusions
we, of course, must reject on the basis of our preceding evidence.*


Such reports clearly demonstrate the existence of the problem. Yet
one can find no single published methodological inquiry where the
reliability of anthropological field-interviewing is systematically estimated
through the deliberate procedure of assigning different field-workers
to make parallel studies. More than this, one finds only rarely
in specific studies any careful description of the procedures by which
the data were obtained, which would permit some inference as to error.
Thus Stavrianos examined all articles based on field research appearing
in one of the professional anthropological journals over a period of
fifteen months. In five of the seven studies evaluated the method used
in the collection of data was not even described.*

This is not to say that anthropologists are unaware of the problem of
interviewer effect or objectivity of data in general. As Lewis points
out, restudies of the same community are hindered by practical considerations
such as “limited funds for field research, the time pressure
of studying tribes who were rapidly becoming extinct, the shortage
of field workers.” Linton, Radin, and others have also stressed the
problem and have suggested specific field procedures to insure scientific
data.* Mead has alluded very recently to the need for training
anthropology students “to form an estimate of their own strengths
and weaknesses as observers” and has made some brief suggestions for
studies of the conditions affecting errors of observation.* Kluckhohn
in a monograph devoted to the use of the interview and other personal
documents in anthropology repeatedly stresses the importance of the
problem and laments the neglect of it in the past. He remarks:


The limited extent to which ethnologists have been articulate about their
field techniques is astonishing to scholars in other disciplines. . . . Few
interviews are printed and almost none in their entirety. Circumstances are but
partially sketched. . . . The role and participation of the observer is little
detailed: one is not consistently told . . . how many questions and what
questions the interviewer asked, whether notes were taken in the presence
of the subject and others . . . somewhat comparable interviews under somewhat
standardized conditions are not presented and analyzed. . . . Particularly
neglected in the past has been the responsibility of the anthropologist
to report upon himself. . . . Anthropologists must realize that the
“contradictions” between various personal documents from the same tribe may
arise, not from different periods or different degrees of acculturation or from
personal idiosyncracies of the several informants, but from the varying
approaches of the investigators.


And he urges the development of experiments on interviewer effect:


The anthropological mode must become more objective both as regards 
gathering and analyzing data. This will be much facilitated by a number of 
needful experiments. Anthropology, in general, stands on the threshold of 
an epoch when the coarseness and crudeness of its work requires the refinement
which can only be brought by a partially experimental approach.*


Bartlett in the course of an interdisciplinary symposium with an- 
thropologists and other social scientists has similarly stressed the im- 
portance of reliability of observation under field conditions and recom- 
mended the joint application of a test approach for the prediction of 
efficiency of observation, and an experimental approach to the factors 
affecting goodness of observation in complex social situations. How- 
ever, these suggestions in the literature have not been accompanied by 
empirical work on the problem. Psychiatrists have also shown a rela- 
tive lack of inquiry into the quality of the data collected by psychiatric 
interviewing. Yet psychiatric diagnosis rests essentially upon
interviewing. Kempf remarked thirty years ago:


If each important institution can be induced to give, sealed, to a central
committee, its actual working system for classifying cases as dementia
praecox, manic-depressive, paranoia, hysteria, and neurasthenia, illustrated
by cases, the differences would probably be so varied that the whole system
would have to be abandoned because the faithful assumption that symptoms
are similarly applied and evaluated throughout psychiatry would be brutally


That such differences in classificatory systems would in turn lead
to interviewer differences is patent, and concrete evidence will be
presented. Here again there is critical awareness of the problem, but
little accompaniment in the way of massive empirical study of error.

There is no intention to disparage the intelligence of scholars in these
other disciplines by remarking on this situation. The intention is merely
to set the proper framework for the reader in evaluating the data to
follow. As a matter of fact, the most plausible explanation of the differ- 
ence in critical attention to interviewer error would seem to lie not in 
any greater natural sophistication of the survey researcher, but in the 
differing social organization of research in the respective sciences. Psy- 
chiatrists, anthropologists, and scholars in many other disciplines tra- 
ditionally work by themselves, whereas the systematic coverage of 
large populations and the manipulation of masses of data in survey
research require the use of many scientists working co-operatively. It is
this difference in the circumstances of work which affects the saliency 
of the problem of interviewer error and the ease of measuring it. Merton 
brings this interpretation forcefully to our attention in a discussion 
of the difference between the European scholar in the Sociology of 
Knowledge and the American researcher in Mass Communications. Of
course, the generality of his remarks goes far beyond these two
specific fields.

The lone scholar is not constrained by the very structure of his work
situation to deal systematically with reliability as a technical problem.
It is a remote and unlikely possibility that some other scholar, off at some
other place in the academic community, would independently hit upon
precisely the same collection of empirical materials, utilizing the same
categories, the same criteria for these categories and conducting the same
intellectual operations. . . . There is, consequently, very little in the
organization of the European's work situation constraining him to deal
systematically with the tough problem of reliability of observation or
reliability of analysis.


а 2 ituation, and 
By contrast, in survey research men work in a group situation, and
as Merton puts it:


In such a research organization, the problem of reliability becomes so
compelling that it cannot be neglected or scantily regarded. The need for
reliability of observation and analysis, which, of course, exists in the
field of research at large, becomes the more visible and the more insistent
in the miniature confines of the research team. Different researchers at
work on the same empirical materials and performing the same operations
must presumably reach the same results. Thus, the very structure of the
immediate work group with its several and diverse collaborators reinforces
the perennial concern of science, including social science, with
objectivity: the interpersonal and intergroup reliability of data.


Merton's argument takes on added plausibility when we observe
that the few instances where we find an elaborate treatment of
interviewer differences in other fields are those where the normal
isolation of the individual worker has been altered in the direction of
group organization of work. Thus, four of the major studies in
psychiatry which we shall report shortly involved many psychiatrists
screening large numbers of troops in the last war. Several of
the studies in clinical psychology come from military settings. Under
wartime conditions, the availability of many observations by many
clinicians made salient the problem of variation in diagnosis and
provided a natural opportunity to design experiments.

What makes the interview method in all fields singularly exposed to 
criticism is the fact that the data collected are so clearly derived in an 
interpersonal situation. In other methods where the same sort of in- 
determinacy may actually operate, the visibility of the problem may 
not be so marked, and criticisms are unfairly reserved for the interview 
method. Thus, experimentation with animals is the basis for much of 
our knowledge in physiology and psychology. But when criticism of 
such experiments occurs, it is rarely, if ever, on the ground that the 
data are in part a product of the peculiar interpersonal relations be- 
tween animal subject and human experimenter. Such an argument 
seems too farfetched. While such sources of indeterminacy are no doubt 
small in magnitude, it is not beyond the realm of possibility that
“interviewer effects” do occur. Liddell, whose classic research on
conditioning in animals extended over many years, remarks:


Another fundamental characteristic of the method is the intimacy which 
develops during training between animal and experimenter. In the course of 
months or years this intimate relationship alters infallibly, first in the direc- 
tion of dependence and solicitation, but later toward avoidance or hostility. 
We believe that this feature of Pavlov's method differentiates the study of 
conditioned reflex action from investigations in essential physiology. In 
chronic physiological experiments of long duration the cooperation of the 
animal must be secured; but, within the limits which the physiologist
imposes upon his thinking, intimacy between animal subject and investigator
is taken for granted and does not enter into the appraisal of the results of
the experiment.


More recently Christie has raised the issue in most general terms of
the neglect by animal experimenters of such “extra-experimental”
conditions as the previous experiences of the rats used. (We might well
add to this class of conditions the interpersonal relations.) He argues
and even demonstrates that these factors affect the results observed but
are rarely used as a basis for the selection of the animals or the
evaluation of the findings. The indeterminacy is present, but neglected here
because it is not so patent as in the survey interview.

Granted the possibility of interviewer effects on the data in all social
sciences making use of the interview, we might raise the specific issue
as to the actual occurrence and relative magnitude of interviewer
effects in the survey and other fields.

While it is impossible to estimate the magnitude of error typical of 
these fields because of the scarcity of empirical data, it can easily be 
established from the few studies available that interviewer effects do 
occur. For example, in psychiatry we have a number of large-scale 
studies revealing considerable variation in the results obtained by
different military psychiatrists.

Thus Star presents data on the frequency of rejection for general
psychiatric reasons and the specific psychiatric classification applied
for a group of 107,000 recruits screened by different psychiatric
examiners during the month of August, 1945, at U.S. Army Induction
Centers.

Since the interviewers used were not all of the highest professional 
training and the brief screening interview was hardly sufficient time 
for comprehensive examination, the results may overstate the general 
seriousness of the problem of reliability in psychiatric interviewing.
Nevertheless, they demonstrate clearly that there is such a problem.

The range in proportion rejected for psychiatric reasons was “from
5% at Camp Beale, California, to 50.6% at Manchester, New
Hampshire. . . . Not only was there wide variation in the psychiatric
rejection rates, but also there was wide variation in the specific diagnoses
given for these psychiatric rejects. While in the nation as a whole,
39.9% of all psychiatric rejects were diagnosed as psychoneurotic, the
percentages varied among stations with at least 50 rejects, all the way
from 2.7 to 90.2. . . . It might be argued, by way of explaining such
enormous variability in diagnosis, that the statistics . . . represent a
faithful picture of the actual incidence among the populations drawn
into these induction stations. This argument would be easier to support
if the stations within a given region had somewhat the same rates and
if the variability within regions was much less than the variability
between regions. But when Pittsburgh had 3 times the proportion of
psychiatric rejects of Philadelphia, when Detroit had 3 times the
proportion of Chicago, New Orleans 3 times the proportion of Dallas, and
Seattle-Portland 3 times the proportion of San Francisco, it is difficult to
believe that the standards were the same in all places.”

Similar evidence is available in the experiences of the United States 
Navy in World War II. Hunt and Wittson in discussing sources of er- 
ror in neuro-psychiatric statistics remark: 


A further source of erroneous diagnoses enters with the prevalence of
local fashions or biases in diagnostic practice. A specific psychiatrist or
local psychiatric unit may be predisposed toward the use of certain diagnostic
categories and the neglect of others. Thus the final diagnosis in any par- 
ticular instance may be a function of the diagnostic prejudices of the par- 
ticular psychiatrist examining the patient rather than a direct function of 
the specific symptomatology present. . . . In surveying the relative inci- 
dence rate for the various neuropsychiatric disorders in numerous Naval 
installations, one is struck by variations which appear to be impossible for 
explanation in terms of a genuine variation in the nature of the samplings 
involved, and seem plausible only in terms of differing local diagnostic 
customs. One of the authors has already pointed out differences of 800 
in the relative incidence of psychoneuroses in random samplings of
medical surveys from various Naval hospitals. Such differences also appear if
one examines Naval training station selection figures. If we look at the 
figures for special order discharges from training stations for the month 
of April, 1943, we find that only 30% of the discharges from Great Lakes 
were for constitutional psychopathic state, but 60% of those from Farragut 
fell in this category. The incidence of psychoneurosis among total dis- 
charges at Great Lakes, however, was 24% compared with 10% at Far- 
ragut. . . . Another sampling from the training stations (for the month of 
May, 1945) shows that at this time only 2% of the discharges from Great 
Lakes were for psychoneuroses, while this diagnosis was given in 60% of 
the discharges from San Diego. . . . It does not seem that these differences 
can plausibly be explained wholly in terms of genuine differences in the
recruit population sampled. Diagnostic preferences must be operating to
distort the real picture.


An elaborate experiment conducted by the British in 1945 yields
further evidence on the reliability of psychiatric interviewing. The same
125 army officer candidates were examined by two different War Office
Selection Boards composed of highly experienced staff. In the
process, a number of different psychiatrists who were members of the
selection boards conducted independent interviews lasting from twenty
to sixty minutes and appraised both the general suitability of the
candidate and his specific standing on fourteen to eighteen carefully
defined traits. While quite high agreement was demonstrated between the
pooled judgments of the two boards, and between certain pairs of
examiners, the agreement between psychiatrists was not high. The
reliability co-efficient obtained for the appraisal of general suitability
was .65, and the median co-efficient for all the traits was only .47.
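
To make the idea of a reliability co-efficient concrete, the following is a minimal sketch of one common way such a figure can be computed: the product-moment correlation between the ratings two examiners give to the same candidates. The source does not say which co-efficient the British study used, so the choice of this correlation is an assumption, and the function name `pearson_r` and the rating lists are hypothetical illustrations, not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists of ratings.

    Assumption: this is only one of several possible reliability
    co-efficients; the source does not state which was used.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two rating lists of equal length (>= 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("ratings must vary for the correlation to be defined")
    return sxy / sqrt(sxx * syy)


# Hypothetical suitability ratings (1 = poor, 5 = excellent) given by two
# examiners to the same eight candidates; invented for illustration only.
examiner_a = [1, 2, 3, 4, 5, 3, 2, 4]
examiner_b = [2, 2, 3, 5, 4, 3, 1, 4]

r = pearson_r(examiner_a, examiner_b)
print(f"agreement between examiners: r = {r:.2f}")
```

A value near 1 would indicate that the two examiners rank the candidates almost identically; values such as the .65 and .47 reported above indicate only moderate agreement.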

Another demonstration, based on a large number of observations but
only on two interviewers, is available from the psychiatric services of
the RAF during World War II. This demonstration was based, however,
on a carefully designed experiment, in which each psychiatrist
assessed the general predisposition to break down and the occurrence of
traits on the basis of the three-quarter hour interview he conducted
with an equivalent half of a total group of approximately 1350 pilots.
Agreement in the general assessment of predisposition in the sample was
exceedingly high. However, the specific symptoms recorded were quite
different for the two psychiatrists. Thus, for example, Psychiatrist I
found 23 per cent of the pilots “under training” to show morbid fears
or anxiety, while Psychiatrist II found 39 per cent of his interviewees to
show such symptoms.

Studies in the civilian setting have been few, and the observations are
generally limited in number. But they demonstrate the problem. Ash
reports data on the reliability of diagnoses for a series of fifty-two
patients examined at a psychiatric clinic connected with a government
agency. Independent judgments were made by three psychiatrists, and
disagreement by major diagnostic categories occurred in at least
one-third of the cases.

In a much larger study, Mehlman reports data on the differences in
diagnoses assigned to patients in a state mental hospital. Patients were
allocated in an unbiased fashion to one of a series of psychiatrists for
diagnosis. Significant differences among psychiatrists were
demonstrated. Depending on the specific categories studied, the comparisons
are based on from 597 to 1358 patients examined by from nine to six- 
teen different psychiatrists, making the evidence quite impressive. 

Putative evidence of interviewer differences in psychiatric
procedures is available from a study by Grayson and Tolman in which a
group of thirty-seven clinicians gave their definitions of a series of
standard terms in common use. The wide variation in the definitions
that different clinicians gave to such common terms as “aggression,”
“anxiety,” “compulsive,” and “defense” suggests that there would be
considerable unreliability in the application of such terms to actual
patients.

Data on invalidity in diagnosis following psychiatric examination,
rather than the mere reliability between interviewers, is available from
a study by Masserman and Carmichael of one hundred patients in which
they found that “during only a year of follow-up study a major revision
in the diagnosis had to be made in more than 40% of the patients.”

Qualitative evidence of error in psychiatric interviewing is available
from one study where the actual content of the interview was
electrically transcribed. The authors conclude:


Even the most proficient note-taker misses critical material. Perhaps
more important in the recording of psychiatric interview data is the influence
of conscious and unconscious screening in the therapist himself. The
incoming sensory material often is neither adequately nor completely
recorded. The authors found by comparing memories, notes, and actual
transcriptions that important material often was omitted. At times recorded
interviews elicited responses of startle and surprise, as though the therapist
had not previously been in the actual situation and had not previously heard
the patient’s and his own verbal productions. Omissions, distortions,
elaborations, condensations, and other modifications of the data occur, and
these all contribute to the difficulty of evaluating what really happened.

Differences between psychiatrists in the subtle dynamics of their in- 
terviewing behavior, differences that are possibly relevant to the varia- 
tions in results reported earlier, have been demonstrated through the 
application of instruments previously developed to describe social
interaction processes. Using such instruments, Chapple found significant
differences in the degree of “activity” (ratio of talk to silence) of two 
psychiatrists, each of whom interviewed equivalent samples of 250 pa- 
tients. Similar differences were found within another sample of 40 men 
interviewed by two psychiatrists with respect to an index of “tempo,”
another formal dimension of verbal behavior.

If we turn from psychiatry to the related disciplines of clinical psy- 
chology and counseling, we find a similar state of affairs. In counseling, 
the great concern with the actual nature of the therapeutic procedure 
has led to a series of studies where an accurate description of the entire 
content of the interview is available from electrical recordings. Seeman 
compares the character of the interview technique of the six counselors 
he used with the techniques of counselors employed in an earlier study 
by Snyder and demonstrates that the incidence of given types of
behavior is strikingly different in the two studies.

Covner, by comparing the counselor’s written report of interviews 
with an electrical transcription, demonstrates that there are large and 
significant omissions of content in the written record, alterations in the 
time sequence of remarks, and lack of precision in the notes, leading to 
ambiguity. Such findings were conservatively stated, since the
counselor was aware that a transcription was being made and wrote his
report immediately following the interview. (Both these factors are
absent from normal counseling interviews.)

Presumptive evidence of differences in counseling behavior is
available from studies of the attitudes of counselors toward given
interviewing practices. Whether these different attitudes carry over into
actual behavior is, of course, unknown from such studies. McClelland and
Sinaiko, for example, report that among a group of thirteen expert
counselors with relatively homogeneous backgrounds there was
considerable disagreement on the correctness of twenty-four of the sixty-four
specific interviewing practices on which they were queried.

For another evaluation of clinical interviewing involving the
application of a standardized procedure, we again turn to the military situation.
The work of nine different clinicians who administered approximately 
five hundred Rorschach tests to soldiers in the course of the Aviation 
Psychology Program in World War II was compared. All examiners re- 
ceived the same rigorous course and had the same standardized instruc- 
tions to give to their subjects. While detailed data on other features of 
the responses are not presented, significant differences were observed in 
the average number of responses obtained.

In a similar experiment in the civilian setting, a comparison was made
of the results obtained by fifteen different examiners administering the
Rorschach to a total of 633 veterans who were patients in a clinic. The
subjects were presumably assigned to particular examiners merely on
the basis of the current work-load, and the assumption is made that
initial differences in the type of patient seen by a particular examiner
could not account for the findings. The examiners were a fairly homogeneous
group, all having been trained in the same methodological approach on
the Rorschach test. In the aggregate for all examiners, significant
differences in the results were obtained for a large number of the categories
used in scoring the responses. The writer notes, however, that some of
these differences may be due not to the actual behavior in the
interpersonal situation but to the ways in which the scoring system was later
applied, since each examiner scored his own protocols.

One final study demonstrates how intractable the problem of
interviewer effects can be. Three clinicians working in close co-operation
with a given group of children over a period of seven years in the
California Growth Studies rated the presence of certain needs. Although
there was considerable agreement in the ratings of single needs, there
were marked differences in the degree to which each clinician found
sets of needs co-existing in the subjects.

It is clear that interviewer effect is a fundamental problem faced by
all the social sciences which make use of the interview method in the
collection of data. It is in no way exclusive to the survey field. But more
than this, interviewer effects in all these fields have their parallel in the
errors of observation and measurement or interpretation found in other
sciences. When we note that there are observer differences in reading
chest X-ray films or in interpreting the results of laboratory tests for
syphilis or in appraising the malnutrition of children from medical
examinations or of physicians taking a brief medical history or in rating
the state of repair of telephone poles or in categorizing short segments
of observed behavior or in noting the transit of stars in a telescope, we
must acknowledge the fact that interviewing is not uniquely
vulnerable.

Bertrand Russell’s well-known and penetrating comment on animal
psychology illustrates the problem:


The manner in which animals learn has been much studied in recent years,
with a great deal of patient observation and experimentation. . . . One may
say broadly that all the animals that have been carefully observed have
behaved so as to confirm the philosophy in which the observer believed before
his observation began. Nay, more, they have all displayed the national
characteristics of the observer. Animals studied by Americans rush about
frantically, with an incredible display of hustle and pep, and at last achieve
the desired result by chance. Animals observed by Germans sit still and think,
and at last evolve the solution out of their inner consciousness.


This brief review suggests that one basic issue is simply the magnitude
of errors in the collection of data by different methods of inquiry,
efficient ways of estimating their presence in any research, and the
safeguards or checks upon such error. Further, it suggests that any
fundamental study of interviewer effect in a given field such as survey
research may make a larger contribution, since the results have relevance
to the improvement of methods in many scientific fields.


3. THE EVALUATION OF ERROR: LARGER CONSIDERATIONS*

The demonstration of error in the interview must not only be
weighed against the prevalence of error in other scientific methods for
the collection of data. In addition, whatever crudities and disadvantages
characterize the method must be weighed in relation to the gains to be
derived through its employment. Some crudity may be the price
willingly paid in order to obtain essential information. This practical
consideration furnishes one appropriate context for the evaluation of our
later findings.

Murray states this calculation eloquently in discussing how the
scientist should orient his research into personality. His remarks are
eminently pertinent to our problem. “If he continues to hold rigidly to the
scientific ideal, to cling to the hope that the results of his researches
will approach in accuracy and elegance the formulations of the exact
disciplines, he is doomed to failure. He will end his days in the
congregation of futile men, of whom the greater number, contractedly
withdrawn from critical issues, measure trifles with sanctimonious
precision.” And elsewhere in describing his choice of methods, he states:
“We tried to design methods appropriate to the variables which we
wished to measure; in case of doubt, choosing those that crudely
revealed significant things rather than those that precisely revealed
insignificant things. Nothing can be more important than an understanding
of man’s nature, and if the techniques of other sciences do not bring us
to it, then so much the worse for them.”


* Much of the material in this section has been presented in a previous
publication of the project, “Interviewing as a Scientific Procedure,” in
D. Lerner and H. D. Lasswell, The Policy Sciences (Stanford: Stanford
University Press, 1951), pp. 203-16.

The interview, by definition, belongs to a class of methods which
yield subjective data, that is, direct descriptions of the world of
experience. The interest of many social scientists in the phenomenal world
calls for such data, no matter how crude the method of collection may
have to be. For example, three of the most prominent emphases in social
psychology today (the emphasis on desires, goals, values, and the like
by students of personality; the current interest in social perception; and
emphasis on the concept of attitude) all imply subjective data. While
not unique, the interview method has certain advantages for the
collection of such data.

Methods exploiting other personal documents such as diaries, life
histories, or letters do yield an elaborate picture of the individual's
world, his desires, and his attitudes. They have many advantages.
However, these sources are relatively inflexible or inefficient for
certain scientific problems. They may not exist for the particular
population of individuals we need to study, or they may be available only for
some self-selected and possibly biased subsample of that population.
In addition, such documents may not contain information on particular
significant variables, since they are generally spontaneous in origin. It
is true that even total life histories have been commissioned for a
particular scientifically selected sample of individuals who were requested
to cover given areas in the document, but this calls for an act of
co-operation far greater than is required for many problems and greater
than can be required in most instances. In addition, the new applied
role of the social scientist as an adjunct to policy-making requires
continual fact-finding or research as events occur or are anticipated, and the
interview method in conjunction with sampling is uniquely adapted to
such time pressures.

The self-administered questionnaire method provides subjective
reports by the respondent and has the advantages of cheapness because of
the reduction of interviewer costs and the possibility of group
administration, plus applicability on a systematic sampling basis. However, it
has limitations which are not characteristic of the personal interview
method. Most obvious is the fact that the interview permits the study of
illiterates or near-illiterates for whom the written questionnaire is not
applicable, and this may be an important limitation for studies covering
the national population. So the Research Branch of the Army, which
made the most extensive use of self-administered questionnaires, found
it necessary to interview all classes of recruits with less than
fourth-grade education.

Secondly, since it is always possible for the respondent to read
through the entire questionnaire first, or to edit earlier answers in the
light of later questions, the advantages of saliency questions become
dubious, and it is difficult to control the contextual effects of other
questions upon a given answer. Such effects have been found to be sizable.
In the interview situation, it is obvious that later questions can be
hidden from the respondent and can have no effect on
the results of an earlier question.

Further gains result from the fact that the interviewer,
while he might be a biasing agent, might conceivably be an insightful,
resourceful person able to make ratings of given characteristics;
he might be able to explain or amplify a given question, he might probe
for clarification of an indefinite answer or elaboration of a cryptic
report, or he might be able to persuade the respondent to answer a
question that he would otherwise skip. All such advantages involving the
insightful and resourceful interviewer are lost in the self-administering
situation where the mistakes of the respondent

these problems. Infere 


individual from one or another item of behavior. For example, the in- 


tively natural conditions, 
а vertly as in studies involving eavesdrop- 
nformal and unobtrusive
have in common an aversion to the subjective, and a reliance on
inference.

While the methods have this advantage, they also have certain
limitations not characteristic of the interview. Great ingenuity is required if
the investigator is to find appropriate indicators of particular interven- 
ing variables, and errors may well arise in the process of making circui- 
tous inferences about attitude from very remote behavioral indicators. 
Vernon states the limitation well when he remarks: “It is largely owing 
to the indefiniteness of the behavioral content of traits, attitudes and 
interests, that verbal methods have been so extensively developed.” 

How circuitous the inference from behavior can become is easily il- 
lustrated by selecting from the literature such bizarre researches as an 
analysis of subscription figures to the “Nation” as an indicator of radical 
attitudes, or an analysis of the characterization of unmarried women in a 
sample of novels as an indicator of popular attitudes toward the role of
women, or the measurement of sweat secretion as an indicator of the
impact of advertisements.

The informal observation of behavior under natural conditions is
generally not a flexible method, in that the environment may simply not
provide any avenue for the expression of the behavior which is relevant
to the particular problem, and then a really tremendous act of inference
is necessitated. To find out a person's thoughts one must sometimes ask
him a question! This is axiomatic in the case of studies concerned with
the past. For example, one of the most lavish governmental social
research projects in recent years involved the study of the reactions of
the German and Japanese populations to strategic bombing, but these
investigations were not undertaken until after the end of hostilities.
It is obvious that the natural setting of the postwar world was not
appropriate to observing the reaction to the bombing of three years
earlier. Here it was necessary to reconstruct the past either through the
memories of the respondent reported in the course of interviewing or
through historical records.

Just as research may be oriented to a past situation which was not 
and cannot now be currently observed, so, too, research may be geared 
to a future and not yet existent situation. People's wishes, plans, desires, 
and anticipations about the future may be central. Here again observa-
tion at some point in time permits only bare inference as to the perspec-
tive on the future, and it is only through personal documents such as the
interview that this dimension of man's thought is revealed. 

For other problems, it is theoretically possible to use observational 
methods. If one could wait around indefinitely, the natural environment
would ultimately liberate behavior relevant to a given inference. How-
ever, practical limitations preclude such lengthy procedures. As Vernon
puts it: “Words are actions in miniature. Hence by the use of questions 
and answers we can obtain information about a vast number of actions 
in a short space of time, the actual observation and measurement of 
which would be impracticable.” 

It should be noted, however, that observational methods were de- 
veloped in a very efficient and massive form in at least two places and 
were found adaptable to a host and constant flux of policy problems of 
an attitudinal sort when handled on a continuing basis. In the United 
States, for a period of years, the Office of War Information operated 
what were known as “correspondence panels.” A nationwide network
of correspondent observers reported periodically on the concerns, re- 
marks, attitudes, etc., of people in their communities. To give focus to 
the reports, these panels received periodic briefings as to what to look 
for in the way of relevant material. Similarly in England, Mass Observa- 
tion’s national panel of voluntary observers provides a wideflung net- 
work of covert observers reporting periodically to headquarters on 
their observations of behavior, conversation, and the like. 

An observational approach to attitudes can sometimes achieve adapt- 
ability by placing the subject in a specially contrived experimental or 
laboratory situation in which the behavior relevant to a given inference 
would appear. Here one can escape the unpleasantness of dealing with 
mere words, and one can study many problems not amenable to obser- 
vation under natural conditions. However, it should be noted that the 
behavior exhibited here is as much bound by the unstated conventions 
of the contrived situation or laboratory, and by the explicit instructions 
which are characteristic of all experiments on humans, as is the verbal 
report by the nature of the formal interview. Moreover, the ability to
obtain the participation of ordinary people as experimental subjects is
limited. Consequently, generalizations from such procedures may have 
an inadequate sampling basis. 

It should also be noted that the exponents of observation under natu-
ral conditions neglect to realize that the behavior observed in real life is
conditioned by a host of unknown momentary factors operating in the 
environment just as the verbal report of an individual is bound by the 
formal interview situation. In brief, one is always playing some role in
relation to some situation—whether the situation be that of the labora- 
tory, the arena of everyday life, or the interview—and the real issue is 
the kind of situation in which the attitudinal findings are liberated and
the ability to relate the findings to that situation.
[There are also] problems which merely require data that [are]
objective. Consequently, there need be no recourse
to the interview as such. [Yet] here the interview method has had widespread
[use because of its] practical advantages. The decennial censuses of
the United States [deal in large] measure with data as objective as the
presence of [...] plumbing, and such information could be collected
by [inspection] of the building. Yet the census enumerates such
characteristics by interview. Many other interview surveys for govern-
mental purposes have been conducted on household possessions, the
[mileage of] given equipment, the job record of the individual, etc.
Here [again] the information could be collected by observa-
tion or by the [examination] of records. However, the facts may not exist
in any set of records, or it may be less expensive and unwieldy to enu-
merate a whole [series of] such needed facts in the course of a single in-
terview. In [addition, the interview] enables one to relate the given datum
to other characteristics of that same individual which can be measured
simultaneously. For example, insurance company records in the aggre-
gate contain data on every health insurance policy covering
any member of the population, but they do not permit one to analyze
such coverage in relation to health needs and experiences, medical ex-
penses, [...] and other significant variables. Similarly, voting
records [show the voting] behavior of individuals, but the ballot does
not have any [record of] the social and psychological characteristics of the
voter, and consequently, beyond a certain gross ecological level, it is im-
possible to analyze the correlates of such behavior merely by the em-
ployment of such sources.

All of this [means] that there is an important function which the in-
terview [method] performs in the collection of subjective and even objec-
tive data, [which] should not be forgotten in drawing conclusions from
any findings [...]. How well the method performs this function is,
of course, [a separate] question. One cannot use the argument of essen-
tiality as [an] excuse for perpetuating errors and crudities that are re-
mediable. If [anything], the reduction of error becomes all the more cru-
cial in the instance of a method that is widely used and essential in
scientific research.

[The evaluation of the problem] of error is fraught with complications. The demon-
stration [of error] in social research interviewing should be weighed
against the [existence] of error in other fields of interviewing; the ap-
propriate starting point being that we deal with a universal problem.
The damaging effect of error in the interview should further be
weighed against the fact that the method provides easy—and possibly 
unique—access to comprehensive data on realms of experience which 
are important topics for scientific study. But the complexity is further 
multiplied! As we seek to apply our specific findings on error to the 
general betterment of interviewing within social research, we must in- 
terpret the nature of error broadly. Otherwise we shall evaluate the 
problem badly. The very concept of error requires discussion and clari- 
fication. 

If interviewer error were unitary and easy to determine, there would 
be no need for such discussion, but this is not the case. Error is of two 
major types and, in certain instances in social research, very difficult to 
measure. In social research, the measuring instrument is the interviewer.
We use many such instruments for a large-scale survey and our aim is
to insure that the instruments are reliable—that the results do not 
change with the accident of which particular interviewer is employed. 
In so far as there occurs inter-interviewer variation, different interview-
ers obtaining variable results when applied to the same or equivalent
respondents, our over-all measurements are subject to one type of error,
which it would be desirable to estimate or reduce. Moreover, in the
usual survey, since interviewers are frequently assigned to different 
types of respondents, such variation in their behavior reduces our ability 
to establish functional relations between variables, leading to general 
laws, since uncontrolled factors present in one interview and absent in 
another might obscure or distort the relationships.
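
A small numerical sketch may make this first type of error concrete. The figures below are invented for illustration and come from no study discussed here; the function name `summarize_interviewers` and the `criterion` value (an assumed true proportion) are likewise hypothetical. Each interviewer is taken to have questioned an equivalent group of respondents on a single yes-no item. The spread of the interviewers' results about their common mean is used as a crude index of inter-interviewer variation, and the distance of that mean from the assumed criterion is reported separately.

```python
from statistics import mean, pvariance


def summarize_interviewers(results, criterion=None):
    """Summarize yes/no results gathered by several interviewers.

    results   -- dict mapping an interviewer label to a list of 0/1 answers,
                 each list coming from an equivalent group of respondents
    criterion -- assumed true proportion of "yes" answers, if one is known
    """
    # Proportion of "yes" answers obtained by each interviewer.
    proportions = {name: mean(answers) for name, answers in results.items()}

    # Spread of those proportions: a crude index of inter-interviewer variation.
    between_variance = pvariance(list(proportions.values()))

    # Distance of the average result from the criterion, when one is supplied.
    average = mean(list(proportions.values()))
    gross_error = None if criterion is None else average - criterion

    return proportions, between_variance, gross_error


# Invented data: three interviewers who agree closely with one another.
uniform_staff = {"A": [1, 1, 0, 1], "B": [1, 0, 1, 1], "C": [0, 1, 1, 1]}
# Invented data: three interviewers who obtain different results.
varied_staff = {"A": [1, 1, 1, 1], "B": [1, 0, 1, 0], "C": [0, 0, 0, 0]}

for label, staff in (("uniform", uniform_staff), ("varied", varied_staff)):
    props, variance, error = summarize_interviewers(staff, criterion=0.5)
    print(label, props, round(variance, 4), error)
```

In the first invented staff the interviewers agree perfectly, yet their common result lies a quarter above the assumed criterion; in the second they disagree widely, while their average happens to coincide with it. Agreement among interviewers is thus one quantity and closeness to a criterion another.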

While variation between interviewers is a very legitimate aspect of
[error,] it does not exhaust the nature of error.
[Whether or] not interviewers differ in the results they
[obtain, there is the] problem of whether any or all of them obtain
[results that] approximate some true value.
[The distinction between a reduction] of inter-interviewer variation and an
[improvement in] the results must always be kept in mind.
While this would seem obvious, there are circumstances that readily
[...] purely in terms of the restricted concept of error as being synonymous
with inter-interviewer variation. Thus, one might well institute a cer-
tain procedure which has been shown to reduce variation among inter-
viewers at the expense of some loss in the validity of the aggregate re-
sults. Thus, in later chapters, we discuss the reduction in inter-inter-
viewer variation that accompanies the use of certain types of questions.
However, in so far as such questions are inadequate to the revelation of
certain attitudes or certain dimensions of attitude, one must balance the
gain in reliability against the loss in validity in the answers of given
respondents, and one would seek some compromise or optimal solution.
Evaluations oriented purely to the reliability problem also run the
danger of conservatism because the standard against which any inter-
viewer's performance is appraised is that of another current interviewer,
or that of all current interviewers. Since our discipline over interviewers
is bound to have some small effect, we consequently rule out as a norm
any aberrant, radical forms of interviewing that are outside of our cur-
rent practice. We ultimately approximate to a uniform and smoothly
operating staff all engaged in the best current practice, but perhaps far
from ideal practice. It is only as we have as a norm a form of interview-
ing that approximates close to valid results, that we become radical and
experimental. It must be the neglect of this latter concept of interviewer
error that accounts for the rarity of innovation. Note how bizarre Kin-
sey's cross-examination approach to the research interview appeared to
us in social research or how recent it is that public opinion workers have
begun to exploit the procedure of group-interviewing of a number of
respondents. Why has no one emphasized the reverse, having the single
respondent interviewed by a group of interviewers? The lack of em-
phasis on the validity aspect of error has led to orthodoxy in proce-
dures.

The problem of gross effects on the validity of results [must be taken]
into context in evaluating our later findings. Our [difficulty] lies,
[however], in determining the presence of gross effect or invalidity.
[Certainly, where a survey deals with some objective] characteristic,
such as age or formal education, or some item of future or past
[behavior], such as voting in an election or cashing a [bond, for instance],
it is easy to define a true value, and theoretically possible to ob-
tain [criterion] data against which to evaluate interviewer error. How-
ever, even in such instances, the practical problems of obtaining such
criterion [data] have limited the study of gross effects and led to all sorts
[of approximations] for criteria.

But what of the problem in surveys of an opinion or attitude type:
surveys, for example, concerned with such matters as the public's gen-
eral sentiments about Russia or taxation or socialized medicine? Under
such conditions, the direct estimation of gross effects is complex, [for]
there is little or no agreement on the nature of “attitude,” and conse-
quently a criterion may neither be accepted nor even exist. [However, if]
the objective of such a survey was specifically defined in terms of the
particular social situation within which such opinions [would be] ex-
pressed or acted upon, the problem would logically not be different
from that of the factual survey. It is in this direction of greater specifi-
cation of the situational setting of opinions that one might easily solve
some of the problems of validating opinion surveys and also approxi-
mate to greater validity of interviewing procedures. One would then
aim to simulate within the narrow environment of the interview the
very conditions that characterize the larger situation. [However,]
it is most rare to find a study which is so precise as to concern itself, for
example, with the opinions of Negroes about discrimination, as these
would be expressed in a Negro-white social setting or in the context of
immediate reactions on specific Army policies in World War II. Gener-
ally, opinion surveys concern themselves with the general structure of
sentiments in a given area; these sentiments being regarded as internal
states underlying but different from behavior.

How then shall we decide that our interviewers are obtaining truth- 
ful and adequate reports from respondents of their inner feelings? Apart 
from traditional procedures of accepting the appraisal of some judge as 
a criterion, we ultimately decide that certain reports are more valid rep- 
resentations of inner states than others, or rather we decide that descrip- 
tions given under particular conditions are bound to be more valid. In 
the end analysis, such decisions are predicated on some model or con- 
ception of the nature of attitudes and upon some theorizing as to the 
nature of the interviewing procedure under which attitudes are best re- 
vealed. Such models obviously function as criteria for evaluating the 
validity component of interviewing error. A moment’s reflection con-
vinces us of this fact. Why is rapport almost universally accepted as es-
sential to a good interview, and why is the interviewer who obtains
more of it regarded as better? Simply because of the assumption that 
people talk better in a warm, friendly atmosphere, and the additional 
assumption that attitudes are somehow complex and hidden and a lot 
of talking is essential before the attitude is elicited. Why is probing re-
garded as desirable in attitude research? Because of the conception of
attitude as many-faceted, equivocal, subject to qualification and shading
and the like, and the conclusion therefrom that a simple initial answer 
cannot convey the total structure. 

Why do we generally regard an interviewer who obtains a great 
many “don’t know” responses as bad? It is because of the simple as-
sumption that people have beliefs about most everything, and the corol-
lary view that the interviewer who does not elicit the answers must be
doing something wrong. 

With many such specifics, there is no problem. They would be ac- 
cepted by reasonable people. Probably no one would contest the fact 
that the interviewer should not provide the answer himself, since the at-
titude [itself], whatever its real nature, is the property of the respond-
ent. However, as a general problem, we must turn to the critical [examination]
of such models, since they underlie the evaluation of our specific
findings and affect the larger question of improvement of interviewing.
While we cannot hope to establish the definitive model for attitudes or
opinions, we can modify certain extreme past views in the light of rea-
son. More particularly, we can examine whether past theorizing about
the procedures most appropriate for the [revelation] of [attitudes],
howsoever defined, has been adequate to the total problem. It
will be evident upon such examination that many suggested interview-
ing [procedures] either bear little logical relationship to the [...] problem
[or] cope with the problem of validity to the neglect of
[reliability]. We gain little if we adopt [procedures which increase]
validity [...] from a given respondent at the [expense of an in-]
crease in inter-interviewer variation. Reliability must not be [sacrificed in]
social research.

[...] inter-interviewer variation has [become] characteristic of de-
velopments of interviewing methodology for research [...]
stem from the clinical fields. There, the elements [...]
having to do with the uniqueness of the individual case and the depth
[and] complexity of mental processes, plus the traditional [...]
than the collection of comparable [...]
procedure of [...] rapport. For the
[...] grant a gain in validity in the [reports] of some re-
spondents [...]. However, it is obvious that the absence of some form of
standardization may well lead to greater inter-interviewer variation, and
the neglect of this problem in certain writings makes one question the 
over-all wisdom of the recommendation. 

Occasionally there is also a certain dogmatism about such extreme 
statements, which makes one pause. They seem too certain of their 
conception of the phenomenon under study, of the procedure that is 
best, and too convinced of the skill of the fieldworker. One can adopt
the position that freedom gives play for the skilled worker to exercise
his judgment and insight and that one should not put a Freud into a
straitjacket of specific rules of procedure which would allow him to
interview no more skilfully than the most mediocre worker. However,
one must also keep in mind that the number of Freuds in our midst is
limited, and that there is grave difficulty in determining in advance
which particular interviewers should be given freedom to exercise their
genius.

Such views may also go too far in emphasizing the requirement of
rapport. Interviewers can be encouraged to the point of great chummi-
ness with the respondent. While friendliness is fine, and rapport impor-
tant, a certain degree of formality may be superior to maximum rapport.
Where the relationship is too warm and intimate, the respondent may
react excessively to the interviewer. The materials in Chapter II illus-
trate this danger well.

In addition, while one must also grant that there is complexity in so-
cial attitudes, certainly the truth does not always lie in the tortuous,
complex, hidden process. One can go too far in postulating such a model
in social research. In the deserved popularity of such conceptions, one
can vulgarize them. The belief prevails too widely that the richer and
deeper and lengthier the remarks of the respondent, the more likely is
this to be the genuine picture of the attitude. Interviewers are encour-
aged to keep probing and to question the validity of a thin answer. Cer-
tainly there is much truth in this point of view, and we may miss the
full complexity of a deep, tortuous attitude structure in [a given] respond-
ent by not pursuing the answer far enough. [...]
distort the situation just as much [...]
in this world with no hidden [...]


Murray remarks on the dangers of such extreme views in discussing
the proper balancing of emphasis on the manifest and the latent in
personality research.


A psycho-analytic case history seldom portrays the patient as an imagina- 
ble social animal. Even in describing normal people the psycho-analysts put 
emphasis upon the aberrant or neurotic features, because these are the things 
which the practice of their calling has trained them to observe. It is as if in 
giving an account of the United States a man wrote at length about accidents, 
epidemics, crime, prostitution, insurgent minorities, radical literary coteries 
and obscure religious sects and made no mention of established institutions: 
the President, Congress and the Supreme Court. 


Such categoricalness about the model of the phenomenon and the 
model procedure, as well as an unbalanced emphasis on the validity 
component of the larger problem, can be illustrated in a quotation from
Woodside. In suggesting what is proper interviewing procedure for re- 
search inquiries into sexual behavior and fertility problems, she states: 


As most of us know, while the itemized questionnaire or the doorstep in- 
terview may be adequate to obtain information on such things as—say— 
individual preference for radio programmes or breakfast foods, these meth- 
ods are totally unsuited where the questions touch on involved personal and 
emotional reactions, inevitably associated with sexual and contraceptive be-
havior.


The assumptions underlying this specific model are of the same
order previously described and can be explicated from other portions of
the text. The depth character of the processes is revealed in: 


There is more to it than this, when you are dealing with a subject as emo-
tionally charged as sex. The interviewer needs to know something of peo-
ple, and to have an awareness of psychological mechanisms such as amb-
ivalence, repression, rationalization, when he encounters them not in the
text-book but in the individual. . . . Though one's subject cooperates in
good faith, he or she may be unable to free themselves of the inhibitions
arising from their own inner conflicts . . . or escape from giving the ap-
proved answers imposed by outer cultural standards.


The emphasis on uniqueness of the respondent, on the requirement of
warmth of rapport, and on the skill of the investigator is seen in:


Always we have to remember that they are not ciphers or anonymous
“subjects,” but they are human beings, each with individual personality
make-up and an individual life situation. If we want them to talk to us, to
reveal something more of themselves and their attitudes than appears on
census sheets, we have first of all to be sincere ourselves, sincerely interested
in them as persons, yet at the same time being alert to their reactions and
their interview behavior. . . . We will probably only get the information
we want by allowing and even encouraging our “subject” to talk in what
may seem an irrelevant manner about himself. The experienced observer
sometimes picks up his most important clues from a chance remark.

As Murray implied, such extreme conceptualizations are bound to 
distort the phenomenon and reduce validity. Kinsey erred similarly, but 
on a limited aspect of the problem, when he started from the assumption 
that false reports from his respondents would tend always to reduce the 
correct estimates of sexual behavior, and not to inflate them. He then 
designed his interviewing methodology in this light, but in this instance,
it can even be shown by analysis of his own data that the assumption is
unwarranted.

That the phenomenon, attitudes about sex, is inevitably associated
with involved, emotional reactions and totally unsuited to the straight-
forward, standardized research inquiry seems questionable simply on the
axiomatic ground that people differ and there are some people some-
where for whom simple questions under standardized conditions would
be adequate. Furthermore, the empirical evidence of many past inquiries
of a quantitative sort also calls into question such a view. We need only
look at sexual inquiries in the United States, Puerto Rico, or England to
note that relatively standardized procedures at the least cannot be totally
unsuited to the problem. Thus Mass Observation in commenting on a
survey of sex attitudes in Great Britain remarked that, “In this survey,
as was the case with that on birth control, many people stopped at
random in the street were eager to talk to perfect strangers who they
were not likely to see again.”

Similarly, Finger, who conducted an inquiry into sex beliefs and prac-
tices among 138 unmarried male students via a standardized question-
naire administered under careful conditions, remarks:


The nature of the responses at least suggests general lack of inhibition in 
answering. . . . The reliability figures leave little to be desired, if they can 
be taken at face value. . . . One is tempted to compare the figures obtained
in this study with those resulting from interview studies of other popula-
tions. . . . The findings of approximately 93% masturbators checks reason-
ably well with Ramsey’s, Kinsey’s, Hamilton’s, and Merill's. . . . Ramsey
found 30% of 17 year-olds reporting homosexual experience, while the pres-
ent study reveals 27%. Approximate agreement is found in most of the
other comparable items.


Finally, one must note in the illustration from Woodside that the
problem of reliability is completely neglected. Admittedly, she is speak- 
ing of the small-scale, qualitative inquiry; nevertheless, there is still some 
comparability. 

It is axiomatic that no model of an extreme nature can be regarded as 
generally ideal. The nature of attitudes, apart from formal definition of 
the concept, will vary with the subject matter under study. Some will 
be affect laden, others not. Some will be deep and tortuous, others super- 
ficial. The same attitude will vary in its character in given cultural and 
sub-cultural settings. The purposes and conditions of social research are 
so various that we must be flexible in our conception of what is appro-
priate interviewing methodology. More than this, any model procedure
must somehow compromise between the requirements of reliability and 
validity. 

Apart from such logical considerations, one questions the authority of 
most traditional conceptions of proper interviewing procedure, when
one notes the wide variation in the recommendations of different investi- 
gators on the same problem. Where there is so much disagreement, one 
might well be tentative in his views. The lack of consensus can be dem-
onstrated for an earlier era from a study by Cavan. She tabulated the
suggestions in the literature of the twenties as to the proper interview-
ing procedures in gathering life history materials. Some of the results
are reproduced below in Table 1, and indicate that past consensus is so
poor that such conceptions in totality afford little guidance.


TABLE 1
HOW TO HANDLE THE INTERVIEW
(The Conceptions of Thirty-eight Different Investigators)

                                                                  No. of Times
                                                                   Mentioned
Control of the interview:
  Provide ample time and appearance of leisure ....................... [...]
  Interviewer should control the interview and adapt it to the
    particular case .................................................. 6
  Explain the purpose of the interview to interviewee ................ [...]
  Make appointment with the interviewee ahead of time ................ [...]
  Keep the interview to the main issue ............................... [...]
Comfort of interviewee:
  Use informal and natural manner, tact .............................. 4
  Avoid distractions ................................................. 2
  Make interview agreeable and entertaining .......................... 2
  Avoid fatiguing interviewee ........................................ 1
  Put interviewee at ease ............................................ 1

Making friendly contact, identifying oneself with the interviewee: 


Giving interviewee confidence: 


Give interviewee feeling of security, “transference” in psychoanalysis... 2
Promise confidential use of material from the interviewee............ 2 


Securing spontaneous response: 


3 
7 
2 
1 
1 
1 
To secure veracity, avoid leading questions or suggestions ........... 5
To overcome inhibitions:
  Use another approach ............................................... 1
  Speak of experiences the interviewee might have had ................ 1
Incentives to induce interviewee to talk:
  Flatter interviewee, “his experience is unique,” “only the best in
    his profession are being interviewed,” etc. ...................... 4
  Appeal to pride, vanity, through giving him a part in a research
    project .......................................................... 2
  Appeal to interviewee’s desire to help others, that his experiences
    will help others ................................................. [...]
  Let interviewee feel he is leading the interview ................... [...]
  Promise that no punishment will follow the interview ............... [...]


[...] behavior of many interviewers [...] of inter-interviewer vari-
ation. Yet one senses that in specific instances, some of the suggestions
are mere “common-sense” opinions, or that they are presented at too 
concrete a level of description. They are too categorical and not befit- 
ting the wide variety of situations and phenomena under research study. 
For example, it is not clear that “occasional physical contact" is neces- 
sarily a good means of achieving the larger goal of “friendly relations.” 
It might well be undesirable for circumstances involving a male inter- 
viewer with a strange and reserved female respondent! Nor is it clear 
what ultimate end in terms of data the goal of friendly relations serves 
and exactly how well it serves that end. 

Therefore, while concrete prescriptions serve to standardize proce- 
dure, they may suffer from too great specificity in relation to the wide 
variety of interview problems. Some resolution of this dilemma is re- 
quired and can be found in providing concrete rules and also in provid- 
ing some larger framework of principles which allows for altering the 
rules under given circumstances. Thus, for example, Roethlisberger and 
Dickson in the course of their classic investigation of industrial workers 
developed an elaborate interviewing method. They make a significant
distinction between “rules of orientation” and “rules for conducting the 
interview.” 

The rules of orientation embody a conception of the nature of atti- 
tudes plus a theory of the interview as a social situation affecting the 
adequate expression of attitude. These rules are intended as a general 
framework of principles to guide the interviewer's specific behavior. 
The rules of conduct, by contrast, involve very concrete suggestions for 
the behavior in which the interviewer should engage to elicit valid in- 
formation. 

By this distinction, Roethlisberger and Dickson are suggesting that the 
concrete behaviors or performance of the interviewer may well change 
with given circumstances, and that the real measure of the goodness of 
a procedure is its appropriateness to some larger objective. They re-
mark:


The rules of performance should play a secondary role to the rules of
orientation. If the interviewer understands what he is doing and is in active
touch with the actual situation, he has extreme latitude in what he can do.
Whether or not the interviewee faces the light is not of first importance.
. . . The rules of performance must address themselves to the situation.


While the general logic of the Roethlisberger approach is impeccable 
—a set of procedures that are concrete and yet flexible and derived from 
some larger conception of the phenomenon—here again one senses a 
slightly disproportionate emphasis in the model of attitude advanced.
While these authors caution against complete disbelief about the mani-
fest remarks of a respondent, they, too, suggest an identity of the deeper
with the more genuine. “The interviewer would not have been misled 
by the manifest content of the statement"; “It is necessary to treat in- 
dividual responses as symptoms, rather than as realities or facts, of the 
personal situation which gradually is disclosed as the interview pro- 
gresses"; "Most omissions that occur in an interview involve not only 
things about which the speaker does not wish to talk but also things 
which lie so implicitly in his thinking that they have not yet become 
conscious discriminations." This excessive emphasis upon the hidden 
subtleties of attitude leads them to give the interviewer great freedom 
to exercise his judgment with consequent danger of error. 

In developing a model interviewing procedure, one must somehow 
balance the gains in reduction of inter-interviewer variability that come 
from standardization against the possible loss of validity due to the in-
flexibility of the procedures for the range of circumstances, the con-
straints placed upon the interviewer's insight, and the loss of informality.
One can array various approaches in the literature along the continuum 
of the freedom allowed the interviewer. Depending on the position on 
this continuum, one notes that the validity component has presumably 
been maximized through the exercise of great freedom in interviewing, 
or that the reliability component has been maximized through stand- 
ardization of procedure. One can also note whether or not alternative 
procedures are developed to treat whichever component has been neg- 
lected. Thus, Kinsey made a choice in some degree like that of Wood- 
side. He recognized that an interviewer given freedom to conduct an 
inquiry in his own way might well use a biased wording or order of 
questions, or that two interviewers might at least exercise their freedom 
in different ways and thus make the data noncomparable. He also real- 
ized that verbatim recording of the answers was not subject to inter- 
viewer bias in coding, and that subsequent coding in the office could be 
more standardized and would permit easy checks of reliability. Never-
theless, the interviewers were given no standard [...] order of questions,
on the grounds that the insig[ht] [...]. Thus Kinsey sought [...] at the
price of a possible loss in reliability.

Yet, there was not complete neglect of the problem of inter-inter-
viewer variation. The lack of procedural standardization was presuma-
bly compensated for by the development of long and intensive training
of the small crew of interviewers, testing of them in advance of field 
work to determine the agreement in their coding behavior, and ulti- 
mately by the application of empirical tests of agreement in their col- 
lected data. 
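
An empirical test of agreement of the kind mentioned above can be illustrated with a small sketch. It assumes two interviewers have each coded the same set of answers into categories, and it computes the raw proportion of agreement together with a chance-corrected index (Cohen's kappa). The function names and the sample codes are hypothetical, and the kappa statistic is offered only as one familiar way to quantify agreement, not as the procedure used in the studies discussed here.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Proportion of answers that two interviewers coded identically."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of answers")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Agreement corrected for the level expected by chance alone."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    # Chance agreement: product of each coder's marginal proportions, summed over categories.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two interviewers to the same four answers.
interviewer_1 = ["yes", "yes", "no", "no"]
interviewer_2 = ["yes", "no", "no", "no"]

print(percent_agreement(interviewer_1, interviewer_2))
print(cohens_kappa(interviewer_1, interviewer_2))
```

A high raw agreement can still correspond to a modest kappa when one category dominates, which is why a chance-corrected figure is often reported alongside the simple proportion.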

Hamilton's decision, although he was working in the same area of
human sexual behavior, represents a complete contrast with Woodside's
or Kinsey's approach. He recognized not only the possibility that the
interviewer might use a biased wording and order of questions, but that 
even minor changes in inflection from interview to interview would 
jeopardize the comparability of the data. More than this, he believed 
that the distance in feet and inches between interviewer and respondent 
and the position of the respondent vis-à-vis the interviewer could affect 
the results. Consequently, each question was printed on a little card, and
the interviewer merely handed it over to the respondent who was seated 
in a chair roped to the floor at an exact and unchangeable distance from 
the interviewer. Here we insure comparability, but the interviewer can- 
not make his full contribution. And it is possible that the extraordinary 
safeguards of reliability might well operate to make the general situation 
so bizarre that any gains deriving from an informal chat in a homey
atmosphere are also lost.

As one contemplates these contrasted studies, it might appear as if one 
were driven to the unpleasant choice between interview data that are
completely reliable but also completely sterile as contrasted with inter-
view data potentially full of validity but with a high order of unrelia-
bility. Actually, the choice is not this difficult. Under certain circum-
stances, it is possible to have maximally flexible procedures and to
approximate some degree of reliability by elaborate training and selec-
tion of personnel. Such is the possibility in a study with a small field
staff and long operating schedule, as was the case with the Kinsey re-
port. In other instances, where research involves a massive staff, one can
adopt reasonable procedures which involve considerable standardization
and yet flexibility within a framework of general principles. In public
opinion research, ideally one notes such an orientation to the validity
and reliability problems. The order of questions and their specific word-
ings are standardized, but the interviewer is permitted to make certain
innocuous changes in the procedure to suit the needs of the respondent
—such as repeating the question, stressing a word that was not attended
to, or introducing the question with some parenthetical remark which
might clarify some element of confusion. He is also instructed to probe
beyond the initial answer so as to clarify ambiguous answers, to provide 
an elaboration upon an inadequate report, or to show the reasoning be- 
hind the attitude. Training in nondirective, i.e., unbiased, probing is 
provided for the interviewer, and written instructions in advance of the 
given survey provide a list of "don't's" and also a uniform interpretation
of the questions, objectives, and procedure for the interviewing staff so
as to maintain reliability.


[...] the interviewer to probe in the proper place, to be insightful, to sense a
distortion, and the like, one can develop systematic procedures to deal
with these problems. The great mistake of those who advocate the ex-
treme in freedom is to identify the solution of these aspects of attitude
measurement solely with the interviewer. [...] Interviewing is only one
small part of a larger system, which includes research
and questionnaire desig[n] [...] to elicit real attitudes, one does not entrust it entirely
[...] the interviewer. One can standardize the interviewer's behavior and rely
[...] the pretesting of the questionnaire to determine empirically whether
rapport has been gained. Thus, what might appear to have been lost 
through the constraint upon the interviewer is regained through system- 
atic exploitation of some other feature of the research process. The 
practice of obtaining interviewer report forms wherein the interviewer 
comments on the motivation, interest, hostilities, etc., of the respondents 
gives the analyst the benefit of the interviewer's insights, without their 
biasing the actual field data to the point where respondent’s report and 
interviewer's insight are inextricably mixed. Here again, what is ap-
parently lost in one phase of the research process is regained in another.

This false location of such problems in the interviewer's realm is il-
lustrated clearly in Roethlisberger and Dickson’s account. Their con-
ception of attitudes, as indicated earlier, is that of deeper and complex
structures. But they place the full burden of treating this complexity 
upon the interviewer. They promulgate, as one rule of orientation, that 
“the interviewer should not treat everything that is said as being at the 
same psychological level.” Let us grant the conception of levels of func- 
tioning, but let the analyst treat of this problem systematically, rather 
than the interviewer. Are there not devices for the analyst to discrimi- 
nate the conviction from the lightly held attitude, the self-deception 
from the real? They postulate another rule which suggests that the 
interviewer should treat the responses as indices with some deeper per- 
sonal meaning. Does not this admonition apply equally, if not better, to 
the analyst? Then if one developed systematic research designs to cope 
with such problems of attitude measurements, one could constrain the
interviewer without any loss in validity.


5. THE EVALUATION OF INTERVIEWER ERROR— 
THE ULTIMATE PERSPECTIVE 


Our aim in these introductory sections has been to provide a broad 
perspective on the problem of interviewer effects. We have suggested
that such error needs to be evaluated in relation to other methods and
must be balanced against many other considerations. But nowhere have
we raised the ultimate consideration that interviewing—good or bad—is
only one of the problems requiring methodological consideration in so-
cial research. This study concentrates on interviewing and treats it at
great length because of its complexity. However, it would be a great
mistake if the exclusive focus of this report were to be matched by
exclusive attention to problems of interviewing. The problem must
come into prominence, but so must other problems of theory and
method if we are to make real advances. It was in this spirit that two
related projects were commissioned by the Social Science Research
Council to parallel ours. Those reports read in conjunction with this
provide a far more rounded view of current methodological problems.


CHAPTER II 


The Definition of the Interview Situation 


1. QUALITATIVE DATA ON THE DEFINITION
OF THE INTERVIEW SITUATION 


All research into the nature of interviewer effects is guided by some
model or image of the interview situation. A particular image of the in-
terview directs us to study certain features as the sources of error; other
significant features of the interview may never be examined simply
because our image or model fails to recognize them. The adequacy of
the model is obviously of great importance. How shall it be derived? If
we turn to the explicit or implicit model of an earlier investigator, we
have no assurance of wisdom on his part. His model may well have been
based on too narrow a conception or a wholly false view of the inter-
view.

Thus, if the influential writing of Simmel or the texts of Park and
Burgess and other leading sociologists are our guides, our attention will
be directed to one important aspect of the interview; we will see it as a
“circular response,” in which “there is stimulus and response, with every 
response becoming a stimulus for another response (and) interviewer 
and interviewee generally stimulate each other in new ways as the inter- 
view proceeds step by step." But such a conception may lead us to neg- 
lect noninteractional sources of effect, such as an interviewer’s lack of 
skill in recording quickly or accurately. Or it may cause us to overlook
the residues of earlier interactions, such as persistent autistic influences 
on the interviewer’s perceptions, or the effects of the sponsorship of the 
inquiry upon all of the respondent’s answers, in favor of observing the 
minor dynamic process of question and answer. 


If we turn instead to the classic study by Rice of “Contagious Bias in 
the Interview,” we are informed b[y] [...] [inter]pretation that “this bias was [...]
consciously, to the interviewed, and [...]
by the summary description that “an inquiry . . . disclosed a transfer
of investigators’ individu[al] [...] and a corresponding dis-
tortion in replies given by the latter to scheduled questions.” Here we [...]
[inter]viewer effects [...] particular explanation of them. The findings
reported are perfectly
compatible with the notion that the interviewer simply distorted the 
recording of given answers in accordance with his own prejudice, or 
that he interpreted ambiguous answers in autistic ways. It is Rice’s con- 
ception of the nature of an interview that forces his explanation. 

Wisdom would dictate that our conception of the interview—funda- 
mental to our entire program of research—be predicated on some sound 
basis. And when we consider the origin of earlier conceptions of the 
interview, we realize that they represent essentially a priori views based 
on some particular social science orientation. They may have little em-
pirical basis; and more, they may not even stand up to logical examina-
tion. Thus the Young or Bogardus view conveys the notion of reci-
procity between respondent and interviewer—hardly an appropriate
description of a situation in which one of the parties is often an “aggres-
sor” with a prepared course of action and a definite goal while the other
is an unprepared “victim.” Rice's view suggests that the respondent is
keenly oriented to the mental processes of the interviewer, hardly in
accord with the common experience of the survey interviewer, who
finds many respondents completely detached or apathetic and answer-
ing questions in the most perfunctory way.

Winds of doctrine in social science may well be responsible for en- 
throning an oversimplified view of the interview, which in turn is the 
basis for research into interviewer effect and its control, but which sadly 
neglects many important factors. Thus, perhaps the most influential 
point of view about interviewer effect in public opinion research has
been that the interviewer's own opinion or ideology is the most decisive
factor. A detailed study of this particular factor is given prominence in
a classic work on methodology in public opinion research, and an ele-
gant mathematical proof is accordingly presented that the best solution
to the problem of bias is a proper balancing of the ideological composi-
tion of the field staff. Dedicated to the control of interviewer bias in its
election surveys, the American Institute of Public Opinion followed the
lead of such studies and attempted to balance the political structure of
its staff in its 1948 surveys.

But such a mathematical proof and such an administrative procedure 
have relevance only on the assumption that the primary source of bias 
lies in the interviewer's ideology. It may be, for example, that the inter-
viewer's ideology is far less important in producing bias than his beliefs
about the true sentiments of the population. If this were so, one might
have used a 1948 staff which was perfectly balanced ideologically, but
which would nevertheless have biased the results because of the wide-
spread belief that Dewey would win in a landslide. A letter from one
interviewer after the 1948 election implied this possibility: “The last 
political poll I did October 25 was overwhelmingly for Truman. I 
didn’t feel entirely satisfied when I sent my work in. I felt that perhaps 
I hadn't filled my quota properly.” 

Consideration of such a plausible source of bias—the interviewer's be- 
liefs about the opinions of his respondent—seems to have been wholly 
neglected in more than a decade of methodological work on the prob- 
lem. Why, when it is so obvious? Must it not be because we remained 
blind to the obvious so long as we stuck narrowly to our preconcep- 
tions? And these preconceptions about ideological factors operating 
within the interview possibly received prominence because they were 
part of a one-sided theoretical emphasis on motivational constructs. We 
overemphasized the interviewer's motivation to alter the results, the in- 
fluence of his wishes on his perceptions, and the respondent’s motivation 
to conform to the interviewer's opinions. Cognitive factors in the inter- 
viewer deriving from other sources, such as his belief about the respond- 
ent’s true sentiments, were not noticed because such concepts were less 
prominent in influential bodies of theory. Prevailing theories and con- 
ceptions of the interview must be at least temporarily suspended while 
we go about examining the situation in its true complexity. Lundberg
rightly remarks in discussing the Interview Method that “it is not possi- 
ble here to enter into a detailed consideration of the intricate interstimu- 
lation and response which are the structure and content of the interview. 
The fact is that there are very few scientific data available on the sub- 
ject, although research in this field lies at the very foundation of soci- 
ology.” A sound conception of the interview, which in turn would 
guide future research on interviewer effects into appropriate directions, 
would seem best achieved through empirical study. Then we might 
check whether the interview actually conforms to our preconception 
of it, and broaden our views, where necessary, to accord with reality. 

Such an approach has been the starting point for much of our experi-
mental work on interviewer effect. With many fragments of data ob-
tained by a variety of means, we have tried to reconstruct at least a
portion of what actually goes on in the survey interview. However, we 
have been less interested in the overt actions within the interview and 
concerned more with subtle implicit processes going on in the minds of 
interviewer and respondent. We have sought an account of the inter- 
view as it appears to the individuals experiencing the situation, on the 
assumption that it is the way the situation is defined to the respective 
parties which is most important. There may be significant aspects of the 
interview which are not readily observed—private experiences which
the individuals will not or cannot articulate to us, behavior of which 
the parties are not aware. These realms unfortunately are inaccessible to 
our methods, but we shall gain considerable knowledge of the situation. 
MacLeod has stressed how much phenomenological inquiry revolution-
ized research on perception. So too, a phenomenology of the interview
may radically change research on interviewer effects and even the 
broader field of survey research. 

Toward a description of the interview, we now present the fragmen-
tary beginnings. We shall examine several “case histories” of interview
situations and see what leads they can furnish us in our research, what
alterations they require in the traditional conceptual scheme. Systematic
discussion of principles and presentation of quantitative evidence of
their operation will be postponed until Section 2 of this chapter, and in
later chapters experimental evidence of the biasing effects of these phe-
nomena will be presented. The reader is referred to Appendix A for a
detailed report of the methods used in collecting these data. It is suffi-
cient here to state that, in each case, the interviewer was asked about his
(or her) experiences and reactions directly following the interview, and 
that the respondent’s description of the same interview situation was 
obtained through a special interview conducted a few days later. 


Detachment of Respondent and Interviewer from 
the Social Impact of the Interview 


The first case reveals an interview situation in which the interviewer
defined the respondent as “a creep” toward whom she felt intensely
hostile.


The woman interviewer, in describing her feelings about the male re-
spondent, remarks: “I just didn't trust the guy.” Later she adds [...]
“He made me creep.” When asked what movies she [...]
respondent preferred, she suggested “something sadistic.” Her image of the
respondent was that of an unscrupulous, untrustworthy person, as evidenced
[...] statement: “When I came to the [...]
[...] was occupying a home in a veteran's housing [...]
[...] I doubt very much that he is a G.I.” This general attitude [...]
[...] with respect to his personality, but also with [...] the spec[ific]
[...] he gave. In answer to the question as to whether [...] felt [...]
[...] by any of the respondent’s opinions, she said: [...] all o[f]
that [...] the exception of his statement on why he was a liberal, but even
[...] mistrusted.”
[...] while all this is going on in the mind of the interviewer, the respond-
[ent's] [...] of the interviewer is that of a pleasant, polite, attractive person,
[...] answers that she was “suitable to my idea of an interviewer.” When
asked about it, he says “I'd like to know her better” and adds “I wouldn't
mind to discuss a few things with her.” The respondent even thought the 
interviewer “liked” him and added very cautiously, “I had such a feeling, I 
don’t know why.” 

Despite the intense hostility on the part of the interviewer there was none 
expressed by the respondent. The only suggestion of any disturbing element 
for the respondent is given in answer to the question, “Did you have the 
feeling that the interviewer was surprised at any of your answers?” He said, 
“Yes, she was surprised that I didn’t know about the Better Business Con- 
trol. I hope you don't fire her on this account.” Apart from this, there are 
no overt indications of any effect from the interviewer operating on the
respondent. He reports no such influences, and examination of his answers
fails to suggest any.


The direct observation of such an interview and some of its peculiari- 
ties stimulates us immediately to think in new ways about the interview 
situation and the process mediating interviewer effects. Whether this 
particular situation is common is beside the point. It is the unusual event 
that may be the very basis for new theoretical developments. 

Here is one example of an interview situation which by all the rules 
ought to be an extremely poor situation for the collection of valid data. 
In addition to the intense hostility of the interviewer, she reports that 
she “was particularly worried and depressed” that day and “in a special 
hurry to complete the interview” and the interview was conducted in 
the street. Further, the interviewer was in definite ideological disagree- 
ment with the respondent. 

The case hardly is in accord with a conception of the interview which 
sees both parties reacting strongly to one another, with the respondent 
attuned to the ideology of the interviewer, and responsive to it. This 
respondent is apparently unaware of the interviewer’s feelings. Yet, this 
is not because of any intellectual deficiencies on his part or apathy about 
politics, since he is a well-educated, middle-class person who says about 
himself: “I’m highly interested in political questions and I’m fully aware 
that the relations between this country and Russia is the basis on which 
my own family could live or die. I’m a Catholic and I firmly believe that 
what Russia is doing does not have God’s blessing.” He then expanded 
upon Soviet-American relations for a while longer. 

It is clear that the content of the responses may, under given condi-
tions, be completely unaffected by strong undercurrents of hostility and
ideological disagreement on the part of an interviewer. And this is
paradoxical only in relation to the preconception which sees the inter-
viewer’s sentiments being transmitted to a sensitive receiver, which is
exactly what this interview situation was not like. The respondent
seemed to have as his motive for being interviewed the desire to “sound-
off.” He had well-formed political opinions and his main interest was in
the actual questions. In addition, his ideology seems well supported 
psychologically, and he therefore feels no insecurity in expressing his 
own view. Thus, when asked if he was concerned about whether his 
opinions were like others, he remarked, “Yes, but I feel that I expressed 
the feelings of the major part of the American public—even in the deli- 
cate Negro question." This despite the fact that there was no “delicate 
Negro question" in the interview. The respondent essentially remained 
detached from the social features of the interview situation, showed no 
insight about the other party and thus was not influenced by the under-
current.

Just as the respondent may be insensitive to the attitudes and feelings 
of the most vital interviewer, there is also the good possibility that some
interviewers are not responsive to the most flagrant behavior of a re-
spondent. Interviewers may well develop a professional attitude toward
their work so that they seldom become fully ego-involved in the situ-
ation. It is only when we conceive of the interview as equivalent to a
natural conversation, in which both parties initiate or break contact or
react to each other for reasons of personal whim or preference, that it
seems strange to think of the interviewer as being able to withstand such
experiences. The physician reacts to illness differently from the layman.
It is part of his day's work. The psychiatrist is accustomed to reports 
that might horrify the ordinary man. So, too, the professional inter- 
viewer may be task-oriented and treat peculiar and annoying respond- 
ents as part of the hazards or normal experiences of his job. 

Let us turn for additional evidence to a somewhat different type of
data. The mutual experiences of respondent and interviewer within a
given interview were one avenue to revealing the phenomenology of the
interview situation. Another avenue was the reconstruction of one side
of the situation—the interviewer's—through long narrative accounts of
the totality of his experience.

Note the objective way in which another interviewer—G—describes
[...] during what must have been a hair-raising day even for a
[...] interviewer:


I remember one day when I ran into a woman with a beard—she looked
as if she might be a freak in a circus. But when I got in she was terribly
[...] and really better informed than the average. And that same day I
ran into a household with an idiot child, and the woman just said, “Well,
[come] in,” and she explained about the child and we went on with the inter-
view. I was kind of nervous though. I didn’t know what he'd do. Every once
in a while the child would make sounds I didn't honestly like. And wasn’t it
interesting—The same day I ran into a couple who were quarreling. But she
was perfectly lucid. She'd answer the questions calmly—then turn and re-
sume the personal quarrel with her husband. Once in a while he'd try to
answer—but she'd cut him off.


Or take the report of K, another experienced woman interviewer. 


When asked how she felt when she ran into people who were preju- 
diced, she replied: 


[...] answered: “It depresses
me at times—but I don't need a psychologist—it doesn't get me down. It in-
terests me enough to discuss it with friends—it’[s] [...]
[...] frankly think on that, it disturbed me ve[ry] [...]
done it so long now, I know what to expect. I’m [...]
standing) is as low as it is. [...]
it. It’s more to me, on your surveys, a complete and total lack of interest in
the questions we ask. . . . But as creatures of habit, after you're accustomed
to it, it doesn't hit you in the eye any more. It does momentarily incite you.


While there may well be many interviewers whose feelings remain
[...] [respo]ndents, it is perfectly possible
[...] [c]onduct. Feelings are one thing—
overt conduct another. It is purely an assumption, based on little fact,
to conceive of the interviewer's feelings spewing forth in all directions.
Let us for the moment accept the testimony of these interviewers at its
face value. A highly experienced woman interviewer—K Q—describes
her strong feelings about some respondents:


[...]s, what people think of national and interna-
[tional] [...]ns every damn person so acutely. The fact that a
[...] [e]xpressing such opinions angers me. It's
[...] sex which hold themselves above such
[...] went beautifully. Then I got to the question on atomic energy, and she
pointed to her small son and said, “How can I pay attention to such things,
[...] [ta]ke care of.” My unspoken reaction, natu-
rally, was “No matter how well you take care of him, if you don’t take care
of atomic energy, all your care may be wasted. . . .”


Yet she then goes on to say: 


Of course I simply smiled—I don't think I showed my reaction. That
[...]n of remaining [...]
blank thing. I'm a person with very stron[g] [...]


And she implies a kind of fragmentation between conduct and affect:


I get sick over the answers. But the part of me that gets sick and bothered 
is the socially conscious part. .. . One part of me gets disgusted, but the 
other wants to find out as a basis of action. Statistics on anti-semitism disturb 
me—but you've got to start from some point. You need to know what your 
points are. . . .


She further indicates how a “task orientation” intervenes. Thus, when 
asked whether that part of her rebels while she interviews, she replies: 


Yes, but afterwards. While getting the interviews you're also engaged in 
a lot of drudgery—the basic drudgery of getting the job done. The other 
part gets lost. Very often people will ask, “What do people say?” I don’t
know. I can’t remember at that moment. . . . The actual opinions don't 
register from one to the next interview. Only at the end, when I look over 
all of them, the pattern hits me in the eye. Then I get unhappy. 


Another highly experienced male interviewer—MA—reports the 
same violent affect over the answers of respondents: 


There's something gnawing at my faith in democracy. I’m nowhere nearly
as sure as when I was in college that the people are fundamentally right.
More likely, the people are wrong. . . . I can't say any more, “Give the
people their head, and all will be well.” People are much too pliable—they
will act strongly on issues on which they have only the vaguest understand-
ing. . . . It’s all a cause for profound disheartenment.


Yet when asked what he does about this, he again stresses the sepa- 
ration between conduct and affect: 


I lay it on the side. I think I’m fairly successful as an objective interviewer
in presenting a front of complete impartiality. I've learned not to be sur-
prised or shocked. For example, when I’ve worked in the South and run into
Mississippi farmers who launch into a diatribe about New York Jews. . . .
What do I do about it? I neither agree nor disagree. If I'm pressed into
expressing an opinion, I try to be as vaguely noncommittal on their side as I
can. The few times I’ve worked on surveys with basic social meaning, I’ve
tried to get as accurate and objective a picture as possible of what the person
thought. No matter how disagreeable the medicine is, you have to take it.
There’s no point in attempting to start any attitudinal interplay. It would
have an influence on the respondent's opinion. I try as hard as possible not 
to influence them—I don't really know if I achieve it. 


And later he indicates that such affect can find its issue in more radical 
ways than in the conduct within an interview. Thus, when asked why
he continues to be an interviewer in the face of this disheartenment, he
replies:


Who says I do!! That's one of the basic reasons I left the field. For a
while it was a very serious thing with me. I was profoundly disaffected . . .
I was very upset by it . . . but I was naive . . . I still had hopes. It didn’t
really become serious till after I had done a great deal of interviewing. 

Thus we should, at least provisionally, admit the possibility that some 
interviewers, despite violent reactions to the ideology of the respondent, 
may not reveal this in their conduct toward him. Their orientation to 
the task may intervene to disrupt such feelings. They may be strongly
aware of their volatility, but in the light of long experience and admoni- 
tions about bias, they may be able to control their conduct toward the 
respondent. Such control, such temporary fragmentation of the per-
sonality of the interviewer, is possibly a function of the degree of in-
tensity of feelings aroused in the interviewer or of his habituation to the
experience or of his training. That indignation or disagreement may be 
communicated and may bias the interview under other conditions is of 
course not to be denied, but this must be regarded as a function of spe-
cialized factors. We are indebted to the writer James Stern for his in-
cidental revelation of his experiences as an intensive interviewer during
the U.S. Strategic Bombing Survey of Germany. As an individual with
no previous interviewing experience and great sensitivity of feeling, he
was not hardened to the following interview:


[...] I'm getting along under the Occu-
[pation] [...] [p]ricked balloon all the life that was
[...] [cl]oche hat seemed to collapse. Only
[...]s, as they quickly rise before dis-
[...] came up to hold the dropped head while the
words gurgled out as from a body saturated in water.


And he reports his reactions: 


Well, what do you do and say, you damned Gallup poller? You, with
your fatuous Fragebogen, its questions about prices and taxes, about wartime
domestic problems, the military and political leaders already dead or jailed,
about what plans she and her family have for the future, that charming rosy
little hell called the future? What do you do and say with all that Galluping
and a nervous system like a coil of taut and quivering copper wire? What
do you do and say?


But that he was not typical is clear—Stern continues: 


I once summoned up the courage to ask a tough, square-faced sergeant 
that, after he’d been knocking what he called “the bull-shit outa crying
Krauts.” I asked, not because I knew he was a psychologist by profession but
because I knew he was a different kind of a worm and I wanted to try and
learn a lesson. “What did I say,” he said, as though what he said was all
there was to be said. “Why, I said, Madam, you better quit that blubbering
quick, we gotta long way to go yet and they ain’t gonna keep my dinner
warm on accounta you, that’s what I said, and Jesus, was my dinner cold,
no sirree.” 


That an interviewer such as Stern may flagrantly bias results by the
most direct communication of sentiments is clear from his running
account of another interview:


“Did I blame the Allies for the airraids? Ha, why naturally, we never
once raided America. England? England started them. England.”

“England started the airraids,” I repeated, dropping the smile now and
barely asking the question. “England started the bombing of open cities and
villages? England, I suppose, started that before the Germans flattened
Guernica in . . .”

“I don't know anything about Guernica . . . and . . .”

“No, of course, you wouldn't.”

“I know England started the air warfare against Germany by bombing
Freiburg and Karlsruhe in 1940, in May 1940 and . . .”

“And Germany, of course,” I said, managing the smile again, “bombed
Warsaw and Rotterdam in 1941! And, of course, Germany never declared
war on England . . .”

“Of course not, the English declared war on us.”

“Well, well,” I said, “That’s very interesting, just why did England declare
war on Germany?”

“Why? Why, how would I know? (Aus Feindschaft gegen uns) From
hatred of us, I suppose.”

I let the laugh out and said, “Did you ever listen to the Allied radio?”

. . . “Never” was spat out like venom striking tin.

“Never?”

“Never, I said.”

“Oh, well,” I said calmly, smiling, “Oh, well, that explains a lot.”




Perhaps we have gone too far in thinking that the danger from the
interviewer's strong feelings is that they might be communicated to the
respondent and affect his replies. Experienced interviewers may be well
aware of this. All the primers warn about it. The greater danger might
be that such feelings affect the perception or judgment of a given
answer or the private decision as to the validity of the answer and cause
bias in such areas of the interview as the recording or probing operation.
Here there has been little admonition to the interviewer, probably in all 
likelihood because our basic conception of the interview directs us to 
the communicational features, and not to these other components. 

Let us turn now to another interview, illustrative of different
principles. In our first case, “The Creep,” despite great hostility on the part
of the interviewer, there was no perceptible effect on the respondent.
He was detached from the social impact of the interview, because of a
firm orientation to the issues involved. In this new situation, we perceive
somewhat different processes at work. The particular interviewer
manifested no strong feelings about the respondent, but even if she had, it is
unlikely that there would have been any biasing influence, because the
pattern of behavior of the respondent predisposed against it.

This proprietor of a liquor store in Brooklyn had been interviewed by 
a female interviewer. Here is the reconstructed pattern. The orientation 
of the respondent—a self-defined “tough guy”—seems to be a com- 
pound of cynicism, generalized hostility, and detachment from the so- 
cial process because of egocentrism. 


Here is his orientation to the interview situation as such: 


He began the session with some negative comments to the interviewer
about public opinion polls. When asked later why he wanted to be
interviewed, he said: “I didn’t want to be interviewed. Naturally, if she's walking
her feet off I’ll help her out.” But he added: “Not that I saw any point in
the interview.” This apparent note of sympathy for the interviewer is the
only suggestion of any positive response to her as a person.
The cynicism and hostility and complete detachment may be best
indicated in his summing up of the experience. He said: “This here interview
thing’s a bunch of - I think it is a backdoor way of getting information
for a commercial outfit—A congressman is still going to vote for whoever
he wants to.”


erview pretty well. He 


and comments—“I don’t know—it was in 


onversation like 
nd remember.’ 


question as to whether the interviewer created an initial . . .
unfavorable impression, he says: “Neither, no impression” and
remarks, “I wasn't concerned. I’ve seen better looking dames.”
With respect to any biasing influences from the interviewer, there is
no evidence from examination of the entire protocol that his responses 
were at all affected. Conceivably, one might argue that the respondent’s 
hostility represents the biasing influence of the interviewer's personality, 
but it seems entirely as likely that his hostility is diffuse and would have 
asserted itself with any other interviewer. 

There are occasional bits of evidence of an orientation to the inter- 
viewer, and a concern about her, but this is mixed with other patterns 
which predominate. He says that he thought the interviewer “liked” 
him and that "she seemed to be satisfied that I was giving her the proper 
answers.” But this is contradicted by other blustering remarks to various 
questions. For example, when asked whether he was concerned if his 
answers were like most other people's, he replied, “Never thought—I 
know my opinion is different. It’s no news to me.” And when asked in 
what way he thought the interviewer might have found him different 
from most of the people she talked with, he said, “I don’t know these 
things—I'm not interested in what people think of me." And later he re- 
marked in answer to the explicit question as to whether the interviewer 
seemed satisfied with his answers, “Yes, she had to be.” 

While this hostility is operating within the respondent, what is the 
view of the situation in the mind of the interviewer? The interviewer
reported that he expressed "some hostility" when he was first approached
and that the main reason he submitted was that he “was being courteous, 
found it hard to say no.” The interviewer's reaction to his initial tirade 
about surveys was “he let me have it about opinion (surveys) in gen- 
eral. He did this but was very pleasant—so I went ahead and I was glad. 
He seemed a very decent sort.” 

In relation to the generally negative attitude of the respondent to the 
entire situation, the undercurrent of hostility and cynicism and con- 
tempt, the interviewer seems to show a strange lack of insight. 

A variety of conjectures suggest themselves in relation to this case. 
It would seem that just as a respondent may be untouched by an
undercurrent of activity on the part of an interviewer, so, too, may an
interviewer be oblivious to the affect within the respondent. And perhaps it
is just as well. Insight under either of these conditions would disrupt
rapport even further and perhaps touch off effects that would distort
the answers.

It seems suggested also that a respondent with this type of personality 
and orientation to the interview would be untouched by biasing
tendencies on the part of any interviewer, assuming they were operating. In
addition to the hostility and cynicism, he was detached from the social
features of the interview because of egocentricity. Thus to one question
in the actual survey: “What do you think of the problems facing the
U.S. today, which one comes to your mind first?” he answered, “My
own problem,” and in reporting about his experiences in the interview
he never mentioned a single question that had been asked, and seemed
to show no interest in the original questions.


“Good” Rapport in Relation to the Opinion-Giving Process 


The first two cases reported depart from the traditional conception of 
the way in which the interview situation is structured and from our as- 
sumptions as to the process by which bias is mediated. Despite poor 
rapport and hostility on the part of one of the parties, there was no bias. 
Let us examine now a case which is the prototype of the good interview 
situation and observe whether bias operates. The general interpersonal 
atmosphere of the situation can be quickly conveyed: 


The respondent invited the interviewer into her home, offered to take her 
hat and coat, and even offered her some food, a rather unusual occurrence. 
The atmosphere seemed very relaxed—the respondent was so folksy, it 
couldn’t have been otherwise. The high point in rapport was typified by the
respondent’s later remark about the interviewer: “She had a headache and
wasn’t afraid to ask me for some aspirin. I was glad she felt like she could
ask me.”


The affection was definitely reciprocated. Both parties reported that they
would like to know each other better. The interviewer said of the
respondent, she “was so sweet and friendly she had no impulse at all to refuse a
chat with a stranger.” She also commented about the respondent: “While
not mentally stimulating, her innate kindness and optimism is most
attractive.” The respondent, in describing her initial reaction and motives in being
interviewed, said, “Just because she came to the door and seemed like a nice
person and had some questions to ask me.”
A further bond between them was found in the fact that the interviewer
and the two sons of the respondent had attended the same local university, 


Or a kind of class solidarity. And there was in fact
no marked class disparity or difference in ideology.


The whole interview situation seemed to be in the nature of two
women friends having a “hen party.” There was no note of any
dominance in the situation, nor was there any evidence of hostility. Although
the respondent definitely saw this as a social situation and reacted
strongly to the interviewer, this was not to the exclusion of the survey
content. There was a nice balance of interest in both the social situation
and the questions. The orientation of the respondent to the interview
per se was satisfactory. She was matter of fact about it, but nevertheless
definitely interested and highly conscientious. Thus:


She reported a real interest in the questions and felt a great responsibility
to answer correctly. She commented on the use of the survey results: “I
didn’t think it would make much difference—unless they might bring it up
in Congress. That's why a person should be very careful about answering
so as to give the right one.” The interviewer's evaluation is of the same
order, “She tried hard to get the real meaning of each question.” The
respondent’s sincere approach is conveyed by her last comment: “I figure
society has started something to try to better things and I think that’s
fine.”


But this conscientious devotion to answering the questions never 
reached any dangerous intensity. The situation was not felt to be a test, 
and there was no terrible need for the respondent to do well. While the 
respondent was not very knowledgeable, this did not make her feel in- 
adequate: 


The interviewer reported that she “felt her lack of knowledge was
common to women, so was not embarrassed,” and the respondent said, “I was
wishing my husband was here to answer the questions—he knows more
about it than I do.” This remark did not seem to reflect any feeling of
personal inadequacy, but would seem more an expression of what she accepted
as her culturally defined role. It was all right for women to have inadequate
knowledge since this is not their proper domain. There was no sign that the
woman interviewer expected any more or resented the respondent on this
account.


Yet what mars this ideal picture is the intrusion of an interviewer 
effect: 


According to the interviewer’s remarks there was no bias: “She asked me
what I thought of sending food to Russia. I did not reveal my opinion.” But
while the respondent said, “She didn’t try to change my opinion,” she
also said: “Once in a while I asked her how she felt and we seemed to agree
in our ways of feeling.” She also reported that the interviewer agreed with
her opinions, as indicated by “just her way of talking. Now it may be that
she didn't but she didn't let on that she didn’t.”


Let us speculate about this case. Here was a situation which by the
traditional view of proper interviewing had all the desirable elements
—no marked disparity in group membership, excellent rapport, no
hostility or sharp divergence in ideology, considerable social interaction,
willingness of the respondent to assume her role and the requirements
of the survey seriously yet no special insecurity about her opinions, no
explicit communication of biasing tendencies, and insightful handling by
the interviewer. What then is wrong with it? It was too good! The
identification with the interviewer was too great; the rapport was too
much and the respondent seems to have been biased in the direction of
compatibility with the interviewer's sentiments. However, this case is
only paradoxical in relation to our preconceptions about the proper
interview conditions for the revelation of attitudes. We have oversimplified
the picture. We have assumed that great rapport and friendship
patterns and a lot of social interaction are requirements for good
interviewing, without ever observing the precise operation of those factors
upon the behavior of a respondent. Carried away by the emphasis on
rapport, we have perhaps vulgarized the concept and have mistaken
"love" for rapport. And interviewers may have followed suit, and
striven for great chumminess with their respondents. A certain degree
of businesslike formality, of social detachment, may be preferable.
When rapport transcends a certain point, the relationship may be too
intimate, and the respondent may be eager to defer to the interviewer's
sentiments. This would se


has little real involvement in the task


or even prefer to take ov. 
Perhaps, w 


y hello. It may make the Opening easier, but the 


: eighbor. There are two factors in- 
be friendly but invalid, or less friendly but more 


ewing, if Í get too friendl nt to 
make an adaptation to me . , B Whe p a pitt даг Вы 
their ability to obtain optimum rapport. For example, here are the facts
according to K—a highly experienced woman interviewer: 


When asked if respondents were interested in her or the questions, she 
replied, “It’s pretty equally divided. There’s a great interest in you—in 
what you're doing, what it’s all about. . . . There's also a great deal of 
sympathy offered an interviewer for having a very tough job.” When asked
if the respondents were interested in her personally, she answered: “Yes,
unfortunately. (They ask) do you make a lot of money at this? Do you
like to do it?” When asked if they ogled her or examined her clothes she
replied: “Not too much, but you expect a certain amount of it.” When the
question was put as to whether she felt they were interested in her opinions,
she replied: “Very definitely! They ask me mine, before they give theirs—
only too often. . . . They also ask after giving their answer—‘Am I right,’
‘Do you agree with me?’ ”


Note the difference in the report of MA, a highly experienced male
interviewer: 


When asked whether the respondent's focus of interest was on him
personally or on the questions, he replied: “They’re interested in all those
things in varying degrees. I don't think there's nearly as much interest in
me as in what it’s about. . . . The focus of interest, I think, is very rarely on
the interviewer—on me as such. I never feel self-conscious, or been made to
feel self-conscious. I’m not aware of personal scrutiny after the first minute
or so. Beyond that point there’s not too much curiosity.” When the matter
was pursued, and he was asked what types of respondents evinced an interest
in him, he was vague: “It’s hard to give an accurate answer, I should say,
and it’s almost always momentary. (It occurs) at the beginning of the
interview. It occurs when I'm not native to the area where I'm working.”


Or take the report of M—a highly trained male interviewer with at
least equal ability in making rapport who works in the same city with K:


When asked if respondents look to him for guidance, he replied:
“. . . do they say ‘what do you think’ . . . It doesn't happen often. . . .
with one per cent of the cases,” one per cent or less. He does remark
later in another context, “Oftentimes when you've finished the questions, the
person will say, ‘Well, how did I do—did I answer about the way most
everybody else did?’ ” But when the matter was pursued by the question as
to whether this reaction was characteristic of special types, he was not
very certain: “I w . . . it’s the more intelligent sector
who ask that. . . . men than . . .” To
the question as to whether this reaction varied with the subject matter of the
surveys, he replied: “I can't give anything on that. Wait a . . .
sticks in my mind that some surveys ask what people
think more than others do. But that doesn't make sense, since they're all
opinion surveys. I guess I haven't anything sensible to say.”
The tentative guess to be made from these protocols about the factor
within the interviewer responsible for this difference in the orientation
of the respondent is that it lies, in part, in a kind of intrusiveness of
the interviewer, a tendency to want to enter deeply into the respond- 
ent’s affairs, which naturally increases the orientation of the respondent 
in the direction of the interviewer. In part, it may also derive from an 
emphasis in the interviewer upon the prestige-value of possessing opin- 
ions and other things. Perhaps this latter concern increases the feelings 
of respondents that they must voice opinions, even when they have 
none, and they may try to absorb them from the interviewer. 

Note the continual thread running through K’s report about her ex- 
periences as an interviewer. Among her early remarks, prior to any 
inquiries about it, she comments: 


“If your second question was about Russia or Japan, or Greece or Turkey,
they’d fold up (terminate the interview). They were afraid to show their
ignorance.” Then later on, she says, “Then also the question’s asked—‘Did I
say the right thing?’ You get a lot of that. They take it as an IQ.” And again
later, she reports: “Others, I believe give an opinion that means exactly
nothing to them . . . and they're ashamed to say ‘I don't know’ despite the
fact that it's quite all right.” And later on with respect to a discussion of
probing in the interviews, she says: “You can’t be too persistent . . .
otherwise there'll be too much embarrassment, and they'll discontinue the
interview. People have a great deal of ego as far as the lack of opinion or
knowledge on a subject. They don't like even before a stranger to show they don't
have an opinion on it. You frequently find they'll become arrogant—or
assume a disinterested attitude.”

She does at another point in the interview mention this contradictory
note: “If they really don't know and say so, that’s all right—that’s part of
your job. My reaction is just as satisfactory as if it’s fluent. I’ve had people
tell me after a ‘don’t know’ answer, so that you’re convinced of their
sincerity, that ‘I’m going to learn about these things. . . .’ That's satisfactory
because you're completely convinced that you've had a genuinely good
interview, even though most of the answers are ‘don’t knows.’ ”


Now, while it is certainly true that many interviewers report
encountering this reaction of shame when a respondent appears ignorant,
and it must occur in reality, the pervasiveness of this theme in K's
experience must have something to do with her own particular behavior.
For example: in the report of M on his experiences—a lengthy 17,000
word account, there is hardly a mention of the problem. Perhaps K
liberates this atmosphere in her interviews because of the prestige-value
of opinions in her own mind.


Note also, in the two reports, the difference in the personalities of the 
two interviewers and the gratifications they obtain from the experience.
K remarks: 


“I'm a very friendly soul. I never go anywhere without someone speaking
to me. I enjoy it. . . . If I had to go out and get me a job, I'd try to get
into personnel work. I like to speak to people—hear their ideas—analyze
the different types. . . . I'm just genuinely interested in human nature—
human beings—their behavior—what makes them think as they do.

“When you live in ——, you travel in a certain sphere, and they bore
me to tears after a while. There's a certain sameness and this is a perfect
interlude. My husband says, ‘You sure know some screwballs.’ That’s right!
You can’t take the same thing for a steady diet. There's something interesting
in an intelligent screwball. . . . I can give you a concrete example. I met a
kid, 20 years old, . . . a cultured smart boy. He was working as a bank
clerk, but he was giving it up. He was going to learn to be an embalmer—it
intrigues me why this kid was going to be an embalmer and I found out. I
don’t want to listen to these same damn people with the same ideas all the
time. I would never meet a kid like that socially—or if I did, it would be a
rarity.”29


But M describes his gratification in interviewing differently. He says 
about himself: 


“Every fresh person encountered is a new experience. I say this as though
I’m a person terribly interested in people, but I’m not. I don’t know what
the answer is. I’m fond of people, but also strangely capable of getting along
without them.” When asked at another point what was gratifying in the
interview, he replied: “I think that’s epitomized in the hosiery survey where,
good God! asking 3000 women a stupid question like that would be the most
routinized inquiry. In that case, I’m a theoretical enough guy so that I
became terribly interested in what the pattern of stocking buying was.” At
another point, he remarks: “A propos of that, I’m not very much interested
in people—though I’m conscious that isn’t altogether faithful to the truth. I
just can’t tell you about myself. I haven't the bubbling interest in people
like many an extrovert has. I seem to enjoy people most when I come to,
what we might call, intellectual grips with them.”


Note also how this interviewer has either no intense desire to intrude
himself too deeply into the respondent, or at least is highly guarded
against this tendency:


In recounting a certain interview, he remarks: “She was a little
embarrassed to have me come upon her in what seemed to be almost her living
quarters. But at such a moment, I think I probably have a quality of
disarming simplicity—at any rate I try to convey to the person . . . a sense of my
complete unawareness of surroundings. . . .” Later on, he expands on this
theme: “I realize that if I'd been interested in anything other than getting
their attitudes, I would have also been less objective. . . . No one whom
I've interviewed has ever been aware of my eyes wandering to their
surroundings of their home.”
Similarly, MA shows no strong interest in the respondent or tendency
to be intrusive, and he guards against the dangers. Thus at one point he 
says: 


“One thing I have found with the Jewish group—whenever I've come
into a Jewish household, and come into contact with something familiar,
and identified myself as Jewish—I've invariably noticed extreme and strong
reactions. You get snatched up. It’s so obvious that there’s a strong chance
of coloration of the response that it’s something I’m wary about. I try to
keep that out of the interview till the interview is over.” And while this
interviewer does describe a very strong interest in his respondents, this
interest is of a very specialized sort. Thus when asked if he was interested
in the respondent himself, he replied: "Yes, but how interested can you be.
I'm interested in his attitudes and combinations of attitudes. The average
middle class city home bores me."


This third case history of an interview and related material from the 
interviewers again suggest some modification of the usual view. Some 
degree of sociability on the part of the interviewer is obviously needed.
Some degree of rapport is obviously called for. But there needs to be 
some clarification of dimensions and types of rapport and of desirable 
forms of sociability. Sociability that is predicated on intrusiveness may 
increase the orientation of the respondent to the interviewer, to the 
point where bias is more likely. 

Modification of our usual preconceptions ultimately leading to better 
theory was one product of the case study of the interview situation. 
Established concepts were re-examined and a more refined view of their 
relevance to the interview was obtained. This, in turn, led to systematic 
empirical work on interviewer effect, which will be reported in later 
chapters. 

In addition, in conjecturing on the diverse phenomena already re- 
ported from the case materials, we were led to recognize the larger 
significance of concepts previously neglected. The recognition of these 
concepts, in turn, sensitized us to new phenomena implicit in the case 
studies, and led to further theorizing. 


Role Prescriptions and Interviewer Role Conceptions
in Relation to Interviewer Effects


Again we shall temporarily defer any elaborate discussion and listen
to M's remark in the course of recounting his experiences. Prior to this
point in the narrative account, he had dwelt on the tensions and
alternation of elation and depression that occurred during his field work. He
had then been asked whether such affect interfered with his actual
work. He remarked:
"You'd suppose that the tension would influence the character of the
work done by an interviewer. I mean specifically the way the interview
itself is carried through. But I am inclined to feel that once started on
Question 1, the interviewer falls promptly again into a rather set way. I
don't mean that he interviews like a machine, though perhaps I do mean
this. He is doing a routine, and from the moment of initiation till he's
through, he's pretty largely controlled by the more automatic mental
processes. . . . You see, when you're interviewing a person you're rather
an automaton—you're back in your routine, and you're caught up in it. You
aren't an independent person, a free agent. You're not that till you've left
the presence of the person, and embarked on the wide sea of searching for
the potential next person.”


In part, M is merely repeating what we have already reported in the
other interviewers—he reports what we have labeled a “task-orientation”
or a “fragmentation” between conduct and feeling, but he
emphasizes as the explanation something generally neglected, when he says
he is not “independent,” not “free” when he interviews. It is prescribed
that he behave in certain ways simply because he is an interviewer, and
it is this prescription of the “interviewer's role” which intervenes
between his conduct and his own private feelings or ideology, between
the stimulation from the respondent and his more natural reaction.

Upon consideration, it is quite obvious to anyone that all survey
agencies define in a formal way what is the proper behavior of the
interviewer, and the case studies were not required in order for us to know
this. However, the case studies do stress that such roles are accepted,
and this has been too often neglected in the attention we have given to
the “natural” processes within the interviewer which presumably
operate to cause bias.

Yet the maintenance of the prescribed role is not always easy. The
intensive interviews indicate that at times conflict is felt between the
requirements as set down by the agency and what the interviewer feels
is a legitimate deviation required to meet certain problems. Bias then
occurs not out of ignorance, but because the interviewer decides he has
to flout the rule. Thus, M, the very interviewer quoted above as
accepting the prescribed role, remarks on a hidden crime while conducting an
interview with a foreign person:


. . . felt qualified to paraphrase with strictest faithfulness to the sense. I
realize that this is indefensible so will make no attempt to defend it.
I did feel in doing as I did that I performed conscientiously as an
interviewer in a public opinion survey."


The pressure of given situations in causing deviations from the
accepted role is also demonstrated in the remarks of KO in discussing the
unpleasant respondents she periodically encounters. She was asked how
the unpleasantness affected her:


When the respondent lets you in on sufferance, you feel a sort of
obligation to get the interview over as quickly as possible—with the least hurt
to the respondent. You have a sense of pressure—it's pretty unconscious. On
the other hand when you're received cordially, you have a more leisurely
feeling—you're not afraid to keep repeating the question if you have the
slightest suspicion that the respondent doesn't understand. You probe more
completely.


The impact of a variety of situational pressures on the interviewer's
normally accepted role is seen most clearly in another type of
phenomenological data collected. For reasons to be described later, the
interviewer listened to an electric transcription of a completed
interview, was asked to imagine himself in the actual situation, and was given
the task of recording the answers on the appropriate questionnaire. He
was also asked to report any thoughts or reactions he experienced while
doing the task. Pieces of B's narrative show the difficulties he faces in
maintaining his prescribed role.
After Question 1:


"I feel this is one of those interviews where I'll have to record quickly
and copy it over."

"I hope he'll stick to the questions. I'll probably get very bored and that
may interfere with my proper interviewing technique with him."


After Question 6:


The interviewer didn't have to continue probing. . . . He feels he has 
answered it and you don't. Rather than ask him again and antagonize him 
(the third time you ask it, it is really dangerous because he's liable to get 
very annoyed) I would have coded it. 


After Question 8A: 


I started to get that helpless feeling. He did not answer the question and 
I was forcing the answer out of him. You have to force him, but as you 
force him, he reacts by feeling more strongly. 


After the very lengthy Question 11: 


These long ones give me trouble. Since it’s such a long question, I wonder
if their answer relates to the question as a whole and I have to quickly read
it over again.


After the very lengthy answer to Question 17: 


I feel irritated. I have no room; I have to write all over the place. How
can you write verbatim if there's no place to write verbatim. I get very
irritated. I don’t feel I can get it down this way (verbatim). If I have to
start interpreting what's important, what's relevant and what's irrelevant, I
do it in terms of what I think. Here there is no time to determine it in
objective fashion. Here you have to come to a decision in terms of your own
likes and dislikes. I get doubtful. Am I writing down the things which really
are important? I may not be objective in that I’m picking out certain things
and leaving out others.


The case studies thus not only reveal the importance of the role prescribed
for the interviewer by the agency in inhibiting natural biasing
tendencies; they also reveal the importance of situational pressures in
shattering the normal role with consequent bias. And what is suggested
is that as such a role is shattered, the interviewer is forced into certain
types of biasing behavior as a “task aid,” as a means of coping with the
problem.

Beyond this, they reveal the importance of idiosyncratic definitions
of the role of the interviewer in producing bias. While the role is prescribed
by the agency and usually maintained by various enforcement
measures or by the interviewer's sheer acceptance of it on the basis of
knowledge of the agency’s demands, there may well be conflict with
other definitions of the role proceeding from a variety of sources. For
example, the interviewer may have views as to what other interviewers
or his immediate field supervisor or particular respondents regard as
proper interviewing behavior. While we have no evidence as to such
direct social influences on the definition of the role, we do have considerable
evidence that the definition may often proceed from certain beliefs
the interviewer has as to the nature of attitudes, the nature of
respondent behavior, or the quality of the survey procedures, although
there is the possibility that they may also provide gratification for various
needs.
Note the recurrent report by F of a certain kind of probing behavior
while interviewing and the reasons for this behavior:


I am not satisfied with a “yes-no” answer. I probe into it to make sure they
understand the question. I often get “no”; it’s not really a “no” answer; it
may be a “yes” answer. Frequently, the answer is due to [illegible]
knowledge. I probe just a bit even though the interview doesn't call
for probing on "yes" and “no” answers. . . .


This issue was later pursued by asking her why she probed beyond the
“Yes” and “No”:


“Yeses” are all shades, some “Yeses” are close to “Noes.” You read
the question to the respondent; he’s only catching the essential words; it’s
difficult to know what he considers essential; you'll never know his
interpretation. So, I probe.


And she continues:


On Survey 152, on Question 1 (Can Russia be trusted?), you usually get
what he'd like to see: that Russia should be trustworthy. That's not the
question; when I get such a “Yes” answer, and then probe, I may get that
it’s impossible (to trust her); the "Yes" may change to “No.” Also on the
question on whether they expect a war, you also get wish fulfillment at
first. If you're going through it quickly, you may not uncover his real
opinion on the given question.


She was then asked how she knew that the question was misunder- 
stood: 


I read the questionnaire before I get started. I could readily see that the 
question was colored by political factors. Respondents will frequently be- 
come excited—you'll get a lot of wish fulfillment. On the whole I probe 
wherever possible. It isn't a matter of selecting certain questions in advance 
to probe on. I see in the course of the probing and interviewing the diffi- 
culty—the specifications give you a lead on that. 


She reiterates the basic point: 


I usually try to veer away from "don't know" answers. I probe especially
hard. I usually feel the “don’t know” is a cover up for inadequate information.
I want to know why they say “don’t know": is it because of disinterest,
inadequate information? Sometimes you get an automatic routine
interview and not the true picture. . . . It's not that the person really doesn’t
know; people may have attitudes.


And later she remarks: 


They’re apathetic; they’re fulfilling their obligation. They get through
the questions quickly; they don’t listen and it’s easiest to say “DK.” The
minute you accept the "DK" it makes it easier for them to continue. . . .
You take a question like the expectation of war. A large proportion will have
a feeling about whether a war is coming. When I get a "DK" to that, I
probe.


F's definition of her role in the interview, of the behavior that is most
desirable, includes probing extensively, even where the instructions do
not require it. It is interesting to note that, in relation to the traditional
view of ideological sources of bias, the interview results she might conceivably
obtain would appear paradoxical. With respect to one of the
very examples she discusses, the question on whether Russia can be
trusted, it is amusing that while she herself thinks Russia can be trusted
(her general ideology might be loosely labeled pro-Russian), she would
not be prone to accept a “pro-Russian” answer from a respondent because
of her belief that respondents often answer in terms of their
wishes, and that the interviewer should probe to clarify the issue. Such
peculiar behavior can only be understood by acknowledging the operation
of certain role definitions which intervene between the interviewer's
own political sentiments and his behavior.

Now, whether F's tendency to probe is really desirable is not at issue.
It might well be that probing yields more valid pictures of respondent
attitudes, and this question will be discussed elsewhere.

What is clear is that the differing roles that interviewers define for
themselves with respect to probing, rapport building, recording, etc.,
will account in part for differences in the results they obtain. It is also
clear that there could be fruitful inquiry into the interviewer's general
view of his job to determine the variability in the definitions given by
interviewers. The interviewer has to engage in a variety of behaviors
during an interview and while the role may be prescribed in certain respects,
there may well even be aspects of his performance for which no
definitions at all have been established by the agency, and other aspects
where the prescription is ambiguous. Where there is no comprehensive
standardized definition in the first place, it is only natural for interviewers
to vary. Thus MA remarks:


[illegible] more emphasis should be given in non-directive probing, [illegible]
up the levels to which the study director wants the material to be
explored. There is a tremendous lack of consistency in this business of different
levels of probing. Many good interviews are wasted on that account.
It would be a very good job if they determined at the [illegible]
far the probing should go, just how much can be handled in the
analysis.


Here certainly there is opportunity by training or field instructions
attached to the survey to standardize these definitions or to provide new
ones.


In addition to clarifying existing theories of the interview and of interviewer
effect, the phenomenological studies had even more [illegible] implications
for theory and research on interviewer effects. It not only
[illegible] more complex view of the processes we had [illegible]
but also brought to our attention [illegible]
we had not previously been aware of. In [illegible]
idiosyncratic roles as a source of effect, we noted that [illegible]
interviewer assumed a certain role was because [illegible] given beliefs
[illegible] of attitudes. F believes that [illegible]
lies deeper, and [illegible] world of the
interviewer thus assumes importance. [illegible] demonstration of this:
Bias-producing cognitive factors within the interviewer. Again, let
us defer any discussion of principles, and insert ourselves into the experiences
of interviewers. Listen to this theme running through the narrative
account by G:

She spontaneously remarks in the beginning of her account: “The average 
woman thinks only of her job, or if she's a professional woman, of her 
profession. I just don't think the average woman has as much social con- 
sciousness as the average man." Later when asked if she can ever tell how 
a respondent will answer, she remarks: “Yes, you can pretty much tell. From 
the way they start off—right with the first question (you can tell) whether 
they’re going to be a ‘don’t know’ respondent.” And then she continues, 
“Yes, usually you know the garrulous type right from the first.” And 
when probed about predicting attitudes, she remarks: “No, I can’t tell too 
well how they'll stand—except that if you look about the household, or at 
certain types of men, you can tell they’re staunch Republicans.” 


Or take this report from another interviewer, N, clearly a somewhat
mixed picture, but suggestive of certain cognitive dimensions operating
within the interview situation:


When asked if she could make guesses about the attitudes of respondents,
she replied: “I often get fooled. On Russian [illegible]sciously make such
guesses. But if I do that I’m [illegible] think. Therefore I try not to.”
[illegible] whether there were any characteristics [illegible] “Once
they start talking, I can predict what they'll say [illegible]
they have, unless you don't [illegible] would say they hadn't heard of the
Marshall [illegible]. Very rarely you get a lower income housewife
who is well aware of things; they don't have the time." And when
[illegible] exhibited, she said: “On a series of questions
[illegible] food to Europe, if she'd said earlier that she
didn't know about the Marshall Plan, she will be one who wants to take
[illegible] no one else.” When the matter was pursued by
[illegible] of attitudes they exhibited, she replied: “Ignorant,
narrow, uninformed. They remind me that [illegible]


Such reports from interviewers [illegible]
beliefs and perceptions about the respondent [illegible]
interviewer to produce expectations about how his respondents will answer
questions. These expectations might well be a potent source of bias
if they were to guide the interviewer at various choice points and affect
his decisions on probing, recording, classification of answers, etc. This
suggestion from the phenomenological data was elaborated into a detailed
theory about the types of such beliefs and corollary expectations,
and the biasing effects that might follow. The empirical research generated
from such findings will be reported in Chapter III.

Attitude-structure expectations.—Certain of these expectations seem 
to be predicated on the belief that the attitudes of any respondent are 
unified, are bound together in some organized structure. Consequently, 
the interviewer would expect the respondent to answer later questions 
in a manner consistent with the early answers. As N remarked, “Once
they start talking, I can predict what they'll say.” This particular phenomenon
might be labeled an “attitude-structure expectation,” and it
would seem that interviewers, like most other human beings, would be
prone to it. Thus, Ichheiser has stressed the frequency of this belief, “the
tendency to overestimate the unity of personality,” in accounting for
misunderstandings between people. He also suggests that the operation
of such a belief might well influence the behavior not only of the perceiver
but also of the other person, in our case, the respondent. He suggests
that there is a “tendency of other people, whether consciously or
unconsciously, to anticipate and to adjust their behavior in some degree
to the expectations and images we hold in our minds about their personalities.”

Many psychologists have stressed the universal tendency of humans
to organize and make meaningful their perceptions. For example, Bartlett
talked of an “effort after meaning" and Asch showed experimentally
how fundamental it is to develop an organized, unified impression
of others from only discrete bits of information. Upon presenting
subjects with only half-a-dozen adjectives characterizing some unknown
person and asking them to give their impression of the person,
he always obtained an organized picture. He reports:


When a task of this kind is given, a normal adult is capable of responding
to the instruction by forming a unified impression. Though he hears a
sequence of discrete terms, his resulting impression is not discrete. In some
manner he shapes the separate qualities into a single, consistent view. All
subjects in the following experiments, of whom there were over a thousand,
fulfilled the task in the manner described.


That such expectations might well persist even in the face of contradictory
reports from a respondent during the interview is also supported
by an extensive psychological literature on the influence of an initial perceptual
organization on subsequent perceptions. One of Asch's experiments
demonstrates this process in a way most relevant to our discussion
of interviewer effect.

Two lists of adjectives characterizing some unknown person were
identical in content, but the order of the words in the second list was
reversed. And the picture of the person reported by his subjects varied
with the order. This could only mean that the perception was dependent
not on the mere content but on the initial impression. Asch remarks:
“When the subject hears the first term, a broad uncrystallized but directed
impression is born. The next characteristic comes not as a separate
item, but is related to the established direction."


Direct evidence of this very sort is available from a phenomenological
account given by an interviewer, B, as he listened to an electric transcription
of a synthetic interview, which pictured a rather bigoted respondent
but contained occasional answers that were inconsistent with
the totality of attitudes. His running account of his feelings shows the
immediate formation of a picture of the respondent and the dynamics
by which the expectation was maintained despite contradictory answers.
After hearing the answer to Question 1, he spontaneously reported:


I do have some impressions. The respondent seems very doubtful about
giving his opinion, a little suspicious. I don't have too much respect for
this particular respondent. My immediate impression is that he's one of those
types of individuals who thinks in very personal terms.


After Question 2, he remarks: 


I was right; immediately he's going off on tangents. He's not really interested
in the survey; he’s interested in getting rid of any personal feelings
he has. I feel he’s an old geezer.


After Question 2A: 


Everything he says revolves around himself and is increasing my dislike
of this respondent. . . . I feel hypocritical that I have to encourage him
even though I don't like him.


After Question 3: 


That whole thing just confirms my opinion. My dislike grows. . . . I
already know what this guy is like. I just have to get it down. I feel he's
hypocritical; he doesn’t give a damn about the rest of the Americans, he’s
just covering up. He just cares about himself; it’s guys like him who cause
all the trouble.


At Question 7, the answer on the record was contradictory of the 
previous answers. However, the interviewer, instead of changing his be- 
lief, maintained his original impression and rationalized the contradic- 
tion: 


He's still wary about giving his real opinions. He started to backtrack. It
gives me a nice insight into his character.


At Question 8:


I feel foolish. I know the handwriting on the wall. I know what this guy
is going to say. He just doesn’t know anything about these things. I feel
what's the use of asking these people these questions. It isn’t much use asking
them; after a while I can guess the answers. This guy just doesn’t approve
of anything outside the United States and doesn’t know anything outside of
the U.S.


After Question 11, to which the respondent gave a long and mixed 
answer: 


It occurred to me that I didn’t have to listen actively to his remarks. I
would know what he would say. Wait a minute. I coded the wrong response.
. . . I almost guessed that answer in terms of what opinion I've formed of
the person.


After Question 13, which asked whether the respondent had heard
anything about a current issue:


I was just thinking as he said that, "you're a damn liar. . . ." I'm sure he's
covering up; he's trying not to show his ignorance. I was amused; he hasn't
heard a damn thing about it.
Then I think, "well, what validity has this question got?” He says he's
heard of it. I have to put it down that way, then I wonder how valid this
[illegible] is. Is my impression of what he's heard better than [illegible]
Halfway through I have the impression I know [illegible]
the way he answered this helped me confirm my judgment. I [illegible]
testing it, of asking him, “Are you sure you've heard of it?” [illegible]
skeptical about the response; I really feel the correct answer is no, but
not to appear dumb he would answer “yes.” I could almost have predicted
this answer. He wouldn’t admit his ignorance.


After Question 15:


I could almost have predicted this answer to some extent. He wouldn't
admit his ignorance. I feel that's true; I can write down his answers fairly
well [illegible], yet I’m not allowed to; I’m limited by interviewing
procedure; I'm a little sore about interviewing procedure. I feel he's
justified when he says, “I've answered that already.” It’s true, I do know
what he’s thinking.

Role expectations. The phenomenological data also suggest another
type of belief operating upon the interviewer in setting up expectations
about the answers of the respondent. We might conceive of role expectations
to denote the tendencies of interviewers to believe that certain
[illegible] or behaviors occur in individuals of given group memberships,
and thus to expect answers of a certain sort from particular persons.
Some of these beliefs might well occur because of traditional role
prescriptions characteristic of all societies as illustrated in G's remark:
“I just don't think the average woman has as much social consciousness
as the average man.” Some role expectations might well be posited on
the basis of an oversimplified belief, a stereotype about some ethnic
group. In either case, at the initial moment of interaction in the interview,
the respondent might be pigeon-holed on the basis of some membership
cue, and the structure of his attitudes would be expected to
correspond with that role.

One of the case studies of a particular interview situation shows
clearly the development of a role expectation, in a somewhat stereotypic
interviewer, and is suggestive of the actual biasing effects on the results.
MM, a middle-class, middle-aged white female interviewer in the course
of her work interviewed a working-class Negro girl of twenty-three.
The respondent had completed high school and was now married to a
fireman in a commercial laundry. They resided in Chicago in a furnished
apartment for which they paid $9.00 a week in rent. Within this situation
of obvious class and racial disparity, a role expectation quickly
developed. It is interesting to note that the questionnaire opened with
a traditional saliency question on "the biggest problem facing the U.S.,"
to which the respondent replied, “There are a lot of places in the US
where there is segregation of the Negro. That's a problem for the US."
It might well be that accidental factors, such as an initial response being
“racially oriented,” would contribute to the speed with which an interviewer
would organize the experience in terms of the well-institutionalized
roles of social groups. We shall return in Chapter V to the significance
of such “situational determinants” of interviewer effect. It is clear
nevertheless that the interviewer quickly organized the experience
around the theme of the Negro respondent.


When asked what impressed her most about the interview, she replied:
“The shabbiness of the building, the low IQ of the respondent." In response
to a question, as to the activity the respondent was engaged in, MM in a
gratuitous attempt to paraphrase Negro speech, noted, “just a settin’.” She
returns to the concept of "low intelligence" in a number of places in her
report. Thus, in answer to the question as to whether the respondent was
embarrassed by any of the questions, MM remarks: “Because of her low IQ
she felt embarrassed by most of the questions.” And in a number of other
places, she remarks that the respondent “felt inadequate,” and “felt she
could not answer the questions.” The interviewer structured the situation
so much in this way that she felt it necessary on the original interview
blank, after the respondent commented on an information question, “That
one’s slipped my remembrance,” to make the parenthetical note, “colored
girl, 23 years old.” When asked later to rate the level of information of the
respondent, the entry “not at all” informed was checked, and when asked
to guess what sort of movies the respondent would prefer, MM writes,
“Some light musical comedy or story."
There is suggestive evidence that this role expectation did operate to
affect the behavior of the interviewer.
While one cannot deny the possibility that this respondent truly had
little information and few attitudes, the magnitude of the ignorance
seems exceptionally great. On three out of four questions on recent
major political events, the respondent was recorded as “DK.” In six
instances on opinion questions, she was recorded as “DK.” Free-answer
comments were sparse throughout the ballot. That this seems spurious
is suggested by the contrasting pattern of response recorded by a second
interviewer who obtained the reactions of the respondent to the experience
of being interviewed. The re-interviewer obtained very full answers.
In addition, while the respondent did tell the re-interviewer
periodically that “she didn’t know very much,” she also remarked that
she found most of the questions “very interesting.” And as long as six
days after the interview, she remembered the contents in sufficient detail
to report with respect to a question on the occupation of Germany
that it was difficult and that the interviewer had named "3 or 4 countries
that had troops stationed in Europe. She said if all the others pulled out,
should U.S. troops stay there." Certainly to remember this rather remote
political question so faithfully seems to contradict the overwhelming
pattern of ignorance and lack of opinion that the first interviewer
recorded. It seems very likely that the initial interviewer [illegible]
of issues very much and may have accepted inadequate answers because of
the general view of the respondent as unintelligent.

All this would be perfectly natural in the interviewer as a human
being. Psychologists have stressed the prevalence of stereotypes in a
population and the persistence of these over time, and this might be the
prepared framework for role expectations in the interviewer. But even
the many interviewers without ethnic stereotypes might have role expectations.
Psychologists might conceive of the role-expectational
process as an illustration of the more fundamental law that perception
of a part is determined by the properties of the whole in which it is
contained. Thus Krech and Crutchfield in an application of this principle
to the perception of individuals state, “. . . when an individual is
apprehended as a member of a group, the perception of each of those
characteristics of the individual which correspond to the characteristics
of the group is affected by his group membership." Sociologists argue
for a fundamental character to such expectations, in seeing regularities
of behavior corresponding to group memberships, and expectancies
[illegible] the behavior of persons in given positions or groups, as part of
social reality, almost as a precondition for society. The interviewer as
a member of society has some framework of role expectancies built into
him.

An experimental demonstration of the way in which role expectations
arise out of racial stereotypes and the regularities of social life is available
in the work of E. L. and R. E. Horowitz. The experiment by
analogy shows how such expectations could create errors in the perception
of an interviewer. The fact that the demonstration is based on
young children underscores the fundamentalness of such processes.

White children from the first to the tenth grade living in a com- 
munity in a “Border State," which was characterized by highly institu- 
tionalized patterns of segregation, were shown pictures for very brief 
exposure times. After seeing a library scene containing only four white 
boys reading, the children were asked, “What is the colored man in 
the corner doing?” There was an increasing tendency with age for the 
children to report the nonexistent Negro as engaged in some menial ac- 
tivity. There was a similar increase in the tendency of the children to 
answer the question, “Who is cleaning up the grounds?”, asked with 
respect to a picture containing nothing but a building and grounds, by 
saying that it was a Negro. On a third picture of a beach pavilion with 
tables, the children were asked, “What is the colored girl doing at 
the table at the right?” There is a regular decline with age in the re- 
port by the children that the Negro girl is engaged in nonmenial ac- 
tivity. 

Such demonstrations show by analogy that a strong belief about the 
role that a given group will assume may well influence the cognitive or 
perceptual processes of an interviewer. 

Probability expectations—The demonstration of expectations led to 
theorizing about a third type of belief operative within the interviewer 
which might set up expectations about the answers to be obtained. The 
expectations mentioned thus far develop during an actual interview, on 
the basis of early answers or group membership characteristics of the 
respondent. However, prior to any such cues in the given interview, in- 
terviewers might well have less differentiated and less rigid, but never- 
theless real, expectations about the attitude of any respondent on the 
basis of some belief about the prevailing sentiments in the population 
on prominent issues. This phenomenon might well be labelled a proba- 
bility expectation to denote its statistical content and also its tentative- 
ness in relation to subsequent specific expectations developing within 
the given interview. Unfortunately, no example of this process is
available in the qualitative materials on the phenomenology of the in- 
terview. The concept developed too late to be explored by these means. 
However, [illegible] bearing on this, will be reported shortly, and
[illegible] there are suggestions at least that such
beliefs [illegible] of sentiments have psychological reality.
[illegible] asked students in a course in public opinion research
to predict the [illegible] results to certain questions. While there was
great [illegible] made in the class, all students essayed
a prediction, [illegible] with respect to such institutionalized attitudes
as social distance toward Negroes, there was considerable uniformity
in the prediction. Thus two-thirds of the students predicted that less
than 25 per cent [illegible] population would answer “Yes” to the question,
“Would you [illegible] to have a Negro family in your own social and
economic [illegible] in next door to you?”; and over half the students
predicted that less than 25 per cent would assent to club membership
for a Negro. Similarly, in the course of an actual field study of the
biasing effect of probability expectations on survey results, Wyatt and
Campbell [illegible] student interviewers to make [illegible] of the
percentage distribution of replies to various poll questions. Such predictions
were [illegible], and in the case of such a public issue as political
party [illegible], in May, 1948, in Columbus, Ohio, over one-third of
the field staff [illegible] that the Republicans would receive at least 60
per cent of [illegible] major-party vote.
Another [illegible] of such expectations is available as a by-product
of one of the experiments cited in Chapter III. The NORC national
field staff was asked to estimate which answer would be the majority
position with respect to the question, “In general, do you feel
the United States is now spending too much on our program for European
recovery, about the right amount, or not enough?", with the following
results:


                                                      Percentage of
                                                       Field Staff
“Too much” would be majority position . . . . . . . .      37
“Right amount” would be majority position . . . . . .      63
“Not enough” would be majority position . . . . . . .       -
                                                          100


Su 
ch А | : 
expectations operating upon the interviewer, whatever their 


Speci 

ific neus Й 

error li. Se content may be, would seem to be obvious sources of 

Underlyi t it is interesting to note that cognitive factors of this type, 
ng the objective interviewing situation, had never been exam- 


ied ы. 
10 prior method А USE den. We had 
en preo \ethodological research on the survey interview. We ha 
With his n ain with the ideological factors within the interviewer, 
Otivation to influence the results, and had neglected his per- 
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ception and beliefs (or construed his beliefs as simply mirroring his 
motivations). We had been concerned with what he communicated of 
his point of view to the respondent, and not with the way he saw the 
respondent. This omission must derive from our historic emphasis on 
the immediate communicational aspects of the interview, and our theo- 
retical leanings toward motivational determinants, Because we never 
entered upon any direct examination of the interview situation, we 
could not correct our view. Out of this emphasis upon the communica- 
tional process in the interview, we saw the interviewer as asking ques- 
tions and recording answers, in the process of which he perhaps com- 
municated information, and we neglected the many judgments he made 
in the process. By contrast, in all research on “evaluational interview- 
ing,” where the interviewer assesses a candidate for some purpose, 
methodological attention has been focused on judgments and the cog- 
nitive processes underlying them, which might lead to error. There we 
find a classic literature on “halo effect” in judgments, and on the
influence of stereotypes in judging applicants, stressed in relation to
interviewing of this type.


It is interesting to note that the one published investigation we have
[...]ality of cognitive processes in the in[...]


It is characteristic of th[...] it may be stable and persistent in a
degree which often appears to be out of keeping with the length and nature
of that part of the encounter which gave rise to it. It may remain,
sometimes in a recogni[...]; when further evidence [...]


Detection and control of biasing expectational processes.—The phe[...]
are also suggestive of the possibility that cer[...]


[...]diced attitudes he encountered, he was asked whether such attitudes
were more characteristic of certain groups, to which he replied:


“Yes, I'll say this: you find it more in certain parts of the country. But
you find it in every area, in every class, in Brooklyn or Atlanta. Oh, it’s true
that in Atlanta it’s very rare to find a radical.” The matter was pursued by
asking him if he could tell in advance which people would be like that, and
he said: “Rarely. You get used to being surprised. You never can tell. If
you knew what people would say in advance, you'd be out of business. I've
never been able to tell in advance. Dress, features, manner, income is never
an indication of attitude. Sometimes you can make a generalization, but you
have to be careful. . . . If you're talking on a political issue and you come
into a solidly Republican section, you will find conformity, but you always
find exceptions.”


In a later discussion of the gratifications he derives from interviewing 
he reports: 


“I get continuing gratification from the simple realization that people are
different from one another. I've run into such peculiar combinations of
attitudes. When you find apparently varying sets of opinions within the
same individual, it’s apt to jar you enough to realize once more that you
never can tell. I find it a continuing wonderful thing. You don’t run into
groups or patterns. It may be true in some basic attitudes that large groups
are influenced by the same things, but in many other attitudes, you find
inconsistencies.”


In the unrelated context of a discussion of how he knows when an answer is
invalid, he states: “I don't know unless it's the tone of voice or the
manner. If it’s a long and overlapping type of questionnaire, you can detect
outright inconsistencies. But the most honest individual in the world gives
conflicting answers unless he’s an extremely well-integrated person and has
all his attitudes thought out.”


And M, in discussing his behavior and experiences, suggests that a
strong task orientation, an attention to the required detail, prevents his
forming such expectations. It is suggestive also that M was the interviewer
with relatively little intrusiveness or social orientation toward
the respondent, and perhaps this prevents him from synthesizing
impressions. Thus, in the context of a discussion of his probing,
when asked whether certain types of probes were more effective for
given types of people, he replied:


“All I can say is I haven't discriminated. I can’t contribute anything on
that. It takes a person of different mentality than mine. In general, I can
say in my interviewing, I don’t generalize consciously about [...]. If you
were to ask me at the end of a survey how most people answered, I
couldn’t tell you. I couldn’t discriminate, for example, that younger women
answered such and such a way. When I’m with a person, you're pretty
absorbed in getting what they say. I'm a tabula rasa. I don’t give a damn.
I'm not thinking. I’m just a recording machine. It helps me in my
objectiveness.”

Granted that we find in our later experiments that expectations are 
potent sources of bias, the qualitative material on individual differences 
among interviewers in their susceptibility to expectations will lead to 
an important area of research. If there are such biasing tendencies, vary- 
ing among interviewers and related to given factors, it may be possible 
to detect them by a variety of means and select interviewers who 
would be less susceptible. While such a testing approach goes far beyond
the present project, existing psychological theory gives some
guidance in a search for the nonsusceptible interviewer. The volumi- 
nous studies on stereotypes about ethnic groups might provide clues 
that would differentiate interviewers less prone to role expectations. 
With respect to attitude structure expectations, literature from
experimental work on perception is most useful. Thus, in Thurstone’s
factorial analysis of perception, one of the radical factors inferred was
that of “speed and strength of closure,” certainly akin to the
attitude-structure expectation phenomenon, and many writers have talked of
such polar approaches to perceiving the world as the synthetic vs. the
analytic type, the former somewhat akin to the pure attitude-structure
prone interviewer.


More recently Frenkel-Brunswik has argued that “intolerance of
ambiguity,” the inability to accept the existence of conflicting or
contradictory or complex elements in some object and to be flexible in
perception, is a highly general formal characteristic of the individual,
rooted in the personality. Those who are intolerant of ambiguity would
obviously be prone to attitude-structure expectations as interviewers,
and if this truly is a pervasive characteristic of the individual, it could
be more easily located. We might well find certain simple perceptual
tests of this general tendency.


QUANTITATIVE DATA ON THE DEFINITION OF THE
INTERVIEW SITUATION


The case study material was rich in suggestions of new ways of looking
at the interview situation and led toward fruitful theory about the
mechanisms underlying bias, the barriers to bias, and the correlates of
bias. These theoretical insights were ultimately [...] experimental means,
the results of which are reported in later chapters.

The reader may have felt that some of the phenomena described
were exotic, existed only in occasi[...] interviewers and respondents or in
the [...]

Moreover, even if such theory about the correlates of bias is verified
experimentally, this would provide no evidence on the generality of the
process. The experiment would simply prove the precise operation of
[...] on bias but could not establish the generality of such effects in
the usual survey. Therefore, it would be desirable to have some
notion of the usualness or unusualness of these processes in the
interviewer and respondent.

In this section we present data on the frequency among interviewers
and respondents of some of the phenomena already reported. In some
instances, cross-tabulation of the reported phenomena also provides
some preliminary test of a theory about the biasing effects of such
phenomena.


General Detachment of Respondents from the
Opinion-Giving Process

The evidence from some of the case history material was that past
theorists may have overemphasized the intensity of the experience for
the respondent of being interviewed on many current public opinion
surveys. The material suggested that a respondent may be so non-involved
in the opinion-giving process that he is not concerned over giving
the “right answer” or pleasing the interviewer or anyone else. This
would not preclude other kinds of bias, e.g., the biasing effects of
expectations on the interviewer's handling of the data, but it would [...]
the sensitivity of the respondent to the interviewer's opinion, and
the communication of cues about the interviewer's attitudes.

It may appear perverse to argue that such a phenomenon is a [...]
good thing from the point of view of [...] support [...] the institutions
of interviewing, survey research, and [...] or from the point of view of
[...] seriousness [...] in surveys. It is [...] a good [...] fact that
[...] studying the wrong problems at times. Certainly there [...] about
which respondents must be intense, and perhaps we have neglected these for
the study of the very kinds of issues that do not concern people. But it
may well be a good thing from the narrow point of view of the reduction of
certain types of interviewer effects in current surveys. Some quantitative
evidence that this is truly a widespread phenomenon, somewhat uninfluenced
by transient events, is [...]

Sheatsley reports data on the attitude of respondents toward [...]
to the experience of being interviewed, as revealed in a
special questionnaire administered by NORC to a national sample of
Americans. While he shows clearly that there is little in the way of
strong criticism or hostility to public opinion polls among those who
consent to being interviewed, he also shows that the general reaction
of a considerable portion of the public might be loosely described as 
“lukewarm.” Thus, while two-thirds of the public expressed the view 
that polls are a “good thing for the country,” 18 per cent of the sample 
said public opinion polls don’t make any difference one way or the 
other, and 10 per cent had no opinion at all about the polls. And among 
the favorable individuals, there was little clarity in the reasons for their 
sentiments. Ten per cent of the favorable respondents could proffer no 
reason at all why they regarded polls as a good thing, and 35 per cent 
could only remark that “they show how people feel” or “it’s nice to 
know what people think.” And those who were not favorable essen- 
tially revealed a pattern of indifference, as indicated in the main reasons 
they gave for their sentiments—“Politicians, leaders pay no attention to 
them” or “They’re just opinions, don’t settle anything.” While three- 
quarters of the public reported that they would be favorable to being 
interviewed again, most of those expressed no enthusiasm; 54 per cent 
merely saying that they had “no objections.” This sample was also 
asked if they had ever been approached for an interview on a previous 
survey. And among those who reported a previous experience, certainly 
the most “favorable” group to the process, since they have doubly con- 
sented to be interviewed, 38 per cent described their reaction to the 
previous experience as “no criticism, but no special enthusiasm.” 
These data had been collected in 1947, and comparable data were
again collected on a national sample in 1948, shortly after the widely
publicized failure of the polls to predict Truman’s victory. While
Sheatsley clearly shows that this event did reduce support for the
institution of polls, from our point of view he also shows that
“lukewarmness” is a characteristic pattern. Thus, while the proportion of
the public who expressed the view that polls are “a good thing” dropped
from 66 per cent to 47 per cent, those who frankly said polls are a “bad
thing” rose only to 6 per cent, and the major increase was in the
indifference category. Certainly one might have expected that the public
would show widespread hostility or derision following such a failure,
but by and large this did not occur. People don’t get that excited about
the institution!


In 1950, the reactions of a national sample to an NORC survey were
ascertained. So that respondents would feel easier in reporting
their genuine feelings, a written questionnaire was handed to the
respondent at the end of the interview, completed by him, and returned
to the interviewer in a specially prepared sealed envelope. One question
asked whether the respondent thought that obtaining people’s opinions
in public opinion surveys was useful. While this general procedure and
the particular question wording were different from Sheatsley’s, the
data support the view that “lukewarmness” is a stable and widespread
pattern. While 60 per cent felt it was very useful to obtain people’s
opinions, 10 per cent said it was of little or no use, and the remaining 30
per cent said it was “somewhat useful.” These results run quite parallel
to Sheatsley’s 1947 findings.
The re-interviews with respondents, used as a basis for constructing
the case histories previously reported, also provide some meager
evidence on the frequency of detachment among respondents. While this
sample contained only fifty cases in selected cities, it is noteworthy that
[...] one-quarter of them said either that the questions were of no
interest at all to them, or that only some questions were interesting. With
respect to the point made previously, that respondents may not feel any
embarrassment about their particular opinions or lack of opinions, the
re-interview procedure is unique in affording some quantitative statement
of the magnitude of such equanimity. A battery of questions in
the re-interview related to this problem, and an overall reading of the
[...] was used as the basis for rating the [...] “not [...]ous” about
their opinions; [...] many had given uninformed answers or no answer at
all in [...] survey. [...] of documentation of this latter [...]
[respond]ents who were at best able to answer correctly only one of three
simple information measures, dealing respectively with Acheson's
appointment as Secretary of State, a nationwide address by President
Truman, and the Dutch-Indonesian conflict, eight of them were [...] by
their interviewers as “satisfied with their answers,” and five of them
reported that they “understood all the questions.”


Detachment of the Respondent from the Social Aspects 
of the Interview 


y , 
The case material suggested that, because of the lack of strong rapport,
or sheer apathy, egocentrism, violent hostility, or cynicism, the
respondent may remain rather detached from the interview experience.
He may not have too much interaction with the interviewer and this
would reduce the operation of one kind of bias. Some evidence on
the frequency of such detachment from the interviewer is available
from a mail questionnaire administered to the nationwide staff of
interviewers of the National Opinion Research Center. If we can regard
the interviewers as accurate informants about their respondents, and 
certainly in this area there would be no conscious reason for them to 
report in a biased way, they suggest that respondents are not very in- 
terested. The question asked of them, and the marginal results are re- 
ported in Table 2. 


TABLE 2 


ORIENTATION OF RESPONDENTS TO THE INTERVIEWER AS REVEALED IN
THE REPORTS OF THE NATIONAL NORC FIELD STAFF


“In general, thinking of most of the respondents you interview,
would you say they are very interested in you yourself (your
opinions, your work, your background, your family) or are they
only mildly interested in you yourself, or don’t they take any
personal interest in you at all?”


                                                      Percentage of
                                                    Total Field Staff
Most respondents very interested in interviewer ....       17
Most respondents mildly interested .................       63
Most respondents show no interest at all ...........       20
                                                          100

N = 150


Additional evidence on the indifference of respondents to the social
aspects of the situation is available from the re-interview study reported
earlier in this chapter. In their replies to a direct question as to whether
they liked the interviewer, twenty-one of the fifty respondents said
they “had no feeling about him at all”; they neither liked him nor
disliked him.

[...] if so, whether his opinions were the same as or different from their
own. Over three-quarters of the answers were that the interviewer “didn’t
seem to have any opinions of his own.” [...] reported the bizarre reaction
of a number of his respondents who, after reading this question on the
form, asked [...]

[...] expect that respondents who are keenly concerned about [...]
matters would sense the existence of opinions in the interviewers.
Some respondents were aware of the existence of interviewer opinions, but
by and large they showed little insight into the actual [...]. This is not
to say that this aware group may [...] conceive to be the interviewer's
opinion, but [...] have not sensed his real opinion, or that the
interviewer [...] masked [...] real opinion. This can be demonstrated by
cross-tabulating the answers as to whether the interviewer's opinions were
the same as, or different from, their own against objective evidence as
to [...] between interviewer and respondent opinion. Since the
interviewers completed the same questionnaire as was administered
to respondents in the survey, it was possible to sort out two groups,
those respondents interviewed by interviewers who actually agreed
with [...] a general question on the survey, and those where the
interviewer disagreed. The evidence is presented in Table 3.


TABLE 3

[...] ABOUT INTERVIEWER OPINIONS AS
RELATED TO THE OBJECTIVE DISPARITY IN OPINIONS


                                  AMONG RESPONDENTS WHO WERE
                                  INTERVIEWED BY INTERVIEWERS WITH
                                  [OPINIONS] THAT WERE ACTUALLY:
PER CENT REPLYING THAT
INTERVIEWERS HAD:                    Same          Different

Same opinion ................         19%             23%
Different opinion ...........          2               1
No opinion ..................         79              76
                                     100%            100%

                                   N = 472         N = 446


It appears that there is no relationship between the actual disparity in
opinions and the perception of disparity. It is interesting also to note
that among the small group who sense the existence of interviewer
opinions, there is overwhelming belief that the interviewer is not in
disagreement.
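
The reading of Table 3 as showing no relationship can be checked with a chi-square test of independence. The following is only a sketch under stated assumptions: the source gives percentages and column Ns, not cell counts, so the counts below are approximations rebuilt from the rounded percentages; the 76 per cent "no opinion" figure in the second column is inferred from the column total; and the function name is hypothetical.

```python
import math

# Approximate cell counts rebuilt from Table 3's rounded percentages
# (assumption: the source reports only percentages and column Ns).
# Rows: respondent says interviewer had same / different / no opinion.
# Columns: interviewer's opinion was actually same / actually different.
observed = [
    [90, 103],   # "same opinion":      19% of 472, 23% of 446
    [9, 4],      # "different opinion":  2% of 472,  1% of 446
    [373, 339],  # "no opinion":        79% of 472, 76% of 446 (76 inferred)
]


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


chi2, dof = chi_square_independence(observed)
# With exactly 2 degrees of freedom the upper-tail probability is exp(-x/2).
p_value = math.exp(-chi2 / 2)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.3f}")
```

On these approximate counts the statistic falls short of the conventional 5 per cent level, which is consistent with the conclusion that perceived disparity is unrelated to actual disparity.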


Detachment of Interviewers from the Situation


The material reported earlier suggests that past theorists may
at times have overemphasized the intensity of the motivation of the
interviewer to influence the respondent, or the intensity of his reaction
to the sentiments expressed by the respondent. Interviewers may well be
highly involved in their job and very concerned with the issues studied, but
this interest is not focused on the specific interplay with a given
respondent. Quantitative support for this revision of theory is available
from the results of the mail questionnaire administered to the nationwide
field staff.

Thus with respect to a question asking the interviewers to rate for a
variety of purposes the importance of public opinion surveys, the
purposes emphasized by about two-thirds of them were “institutional,”
service to scientists, or service to the democratic process, and not the 
value of the interview to the respondent. It is true, however, that one- 
third of the total staff stated that the use of polls “to educate the people 
who are interviewed" is a “most important" function. But over half of 
the staff felt that it was not the interviewer's responsibility to educate 
an uninformed respondent, even when the respondent desired to con- 
tinue the discussion after the formal interview was terminated, and 80 
per cent of the staff felt it was not their responsibility to enlighten a
prejudiced respondent, even if he wished to continue the discussion 
after the interview. Two-thirds reported that they do not feel privately 
irritated by a respondent's opinions. That the general orientation of the 
interviewer might be described as a “Task Involvement,” and not a
“social orientation” to the respondent or an affect-laden experience, is 
also clear from other data. A majority report that they only occasion- 
ally or hardly ever would enjoy staying on to chat with their respond- 
ents. Only a tiny minority report that they have frequently made 
friends with a respondent. About half of the staff reports that there 
were no particular questions on past surveys which they would have 
preferred not to ask—despite the fact that NORC’s past surveys have 
covered questions ranging from personal financial matters to experience 
with mental illness and questions about sex. 
With respect to the question as to whether they would object to asking
certain hypothetical questions of respondents, most interviewers report
that they would not strongly object to inquiries into the most
sacred areas. They seem to regard the interviewing process as a job— 
no matter what the content. Thus only tiny minorities report that they 
would strongly object to asking the respondent, “Has anyone in your 
family been in a mental hospital?” or “Do you think masturbation can 
cause mental illness?” and only about one-quarter report strong ob- 
jections to the bizarre question, “Have you provided for the Salvation 
Army in your will?” 

In this connection, it is most interesting to note that interviewers
occasionally reported as their chief failing the fact of their social
“over-involvement” in the interview situation. They were asked early in the
mail questionnaire the open question, “What would you say are your
chief failings as an interviewer?” Certainly nothing in the literature of
[...] suggested that this would be regarded as a failing [...] of high
social involvement would have [...] trait. Yet, 10 per cent of the
interviewers spontaneously re[...] their chief failing is
“over-involvement.” They say: “I'm [...] sympathetic,” “I like people too
much,” “Too many people [...] to me about personal problems,” “A
disinclination to keep [...] precisely to the subject [...]” [...]
suggests that his failing lies in his lack of social [...] be that
interviewers have learned the wisdom of being [...] as a basis for
carrying on their work efficiently and as [...] against bias. But this
wisdom from experience has [...] been [...]cted in the prevailing body of
theory about inter[...]

Further evidence of an inferential sort on the detachment of interviewers
is available from the questionnaire administered to all interviewers.
[...] questions were intended as indicators of personality [...]. Among
these was a question specially designed to measure the general “sociality”
of the interviewer. As in all personality inventories, measures take on
clearest meaning in relation to statistical norms. In this [...] norms were
constructed by administering the same [...] to a national sample of
respondents. In Table 4 are presented the distribution of answers among the
interviewers as compared with [...] the college educated women in the
national sample, the [...] group most like interviewers in general
characteristics.


TABLE 4

SOCIALITY OF NORC STAFF AS COMPARED WITH COLLEGE
EDUCATED WOMEN IN A NATIONAL SAMPLE


“In dealing with problems of intimate concern to you, do
you prefer to talk them over with other people, or do you
prefer to keep them to yourself?”


                            Percentage of       Percentage of
                            Interviewer         “Norm” Group in
                            Population          National Sample
Talk with others .......        38               [illegible]
Keep to self ...........        62               [illegible]
                               100                  100

                            N = [illegible]       N = 90


[...] examination of marginals, in which it is noted that two-thirds
of the interviewers are not “sociable,” suggests that our traditional
views have been in error. However, in relation to the norms, it
is dramatically demonstrated that interviewers are not as sociable as
their counterparts in the population. This would suggest that their
involvement in the social setting of the interview would not be as great
as it was presumed to be in past theorizing.

Occurrence of expectational processes.—A variety of measures from 
the mail questionnaire suggest that such processes are frequent in oc- 
currence, although not characteristic of a majority. Thus, as a measure 
of "role expectations," interviewers were asked, “How often do you 
feel you can size up the respondent and predict most of his answers in 
advance?" A little over one-third of the staff reported that they could 
do this half the time or better. However, when followed by an open 
question asked of everyone as to the cues used in building up role ex- 
pectations, only a small minority flatly answered that it was impossible 
to predict the answer. Admittedly this question is loaded in the direc- 
tion of increasing the estimate, but the very high figure is nevertheless 
striking. The detailed cues used in such expectational processes are re- 
ported in Table 5. 


TABLE 5

FACTORS ENABLING INTERVIEWERS TO PREDICT RESPONDENTS’ ANSWERS

“What sort of things about the respondent help you predict his answers?”


                                                          Percentage of All
                                                            Interviewers*
Role factors
  Economic level: class, occupation, home, neighborhood ..        54
  Nationality, religion, ethnic group ....................         6
  [illegible] ............................................        11
  [illegible] ............................................         4

Attitude-structure factors
  Education, intelligence, interest in subject ...........        17
  Co-operativeness: initial response to interviewer ......        16
  Answers to first few questions .........................        11
  Respondent's attitude toward the interview situation ...        10
  Personality factors in [illegible] .....................        10
  [illegible] ............................................         4

[illegible] ..............................................        13
Don't try to predict; don't [illegible] ..................        17

N = 151

* Percentages total more than 100 because of multiple answers.


Further evidence of the operation of expectational processes was
furnished by interviewers in connection with an experiment on coding
in which interviewers were asked to code answers under two condi- 
tions: first, with the answers to a given question isolated from the to- 
tality of answers to the questionnaire and, secondly, with these answers 
imbedded in the total context of answers. In conjunction with the
experiment, interviewers were asked what elements in the normal field
situation aided them in classifying difficult or ambiguous answers into
a precoded category. About one-third of the interviewers reported the
use of contextual aids of a stereotypic sort, such aids being almost pure
examples of expectations predicated on the general characteristics of 
the respondent. For example, one interviewer remarks: "If he is an 
ignorant person, I judge his answer on the fact that he doesn't really 
know what the question means and I often put ‘don’t know’ for this 
type person.” 

Another source of evidence on the frequency of expectational processes
is available from a question asked in Elmira in the 1948 Election
study. Respondents were asked to estimate how given population
groups would be likely to vote. Since the interviewers filled out
questionnaires also, the answers to this question provide an estimate of
role expectations. The interviewers completed these questionnaires prior
to the first wave of interviewing in June. Consequently, the estimates of
role expectations revealed in the tables below are conservatively stated,
since the interviewer is predicating his judgment prior to the campaign
and prior to the choice of presidential candidates. It is logical that such
beliefs would be even stronger at later dates closer to election day. In
Table 6 below, selected data are presented on the frequency with
which interviewers expect a number of population groups to vote in
some systematic direction. Also presented is the frequency with which
interviewers checked the alternatives: “don't know” how the given
group will vote, or the group “will not vote as a bloc.” This latter
statistic gives an estimate of the rejection of role expectations.

It is clear that over half the field staff had a role expectation of a
uniform sort for each of the four population groups presented, and that
only about one-quarter of the staff rejected expectations of this type.

Analysis of the Elmira data on role expectations supports the
suggestion of an expectation-prone interviewer. If we intercorrelate the
interviewer’s report of, or the rejection of, role expectations for each of
the four population groups, we can determine the consistency of the
proneness. High consistency would strengthen the idea that there
is some stable pattern within the interviewer making him prone to
such processes. The six correlations range in value from .38 to .87 with
a median value of .59, suggesting a fairly strong tendency for the
interviewer either to reject consistently the notion that the voting of
these groups can be predicted or to expect them to vote in some particular
fashion.
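
Why four population groups yield exactly six correlations, and how a median is then taken over them, can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the 0/1 data are invented (the Elmira responses are not reproduced in the text), the source does not say which correlation coefficient was used, and a plain Pearson coefficient is used here because on 0/1 data it equals the phi coefficient.

```python
from itertools import combinations
from statistics import median

# Hypothetical data (not from the source): 1 = interviewer holds a role
# expectation for the group, 0 = interviewer rejects one ("don't know" or
# "will not vote as a bloc"). One list entry per interviewer.
expectations = {
    "rich people":     [1, 1, 1, 1, 1, 1, 0, 0],
    "factory workers": [1, 1, 1, 0, 1, 0, 0, 0],
    "farmers":         [1, 1, 0, 1, 1, 0, 0, 1],
    "poor people":     [1, 1, 1, 0, 1, 1, 0, 0],
}


def pearson(x, y):
    """Pearson correlation; on 0/1 data this equals the phi coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Four groups taken two at a time give the six correlations.
correlations = {
    (g1, g2): pearson(expectations[g1], expectations[g2])
    for g1, g2 in combinations(expectations, 2)
}
for pair, r in correlations.items():
    print(pair, round(r, 2))
print("median r =", round(median(correlations.values()), 2))
```

With real data, consistently high values across the six pairs would indicate that the same interviewers tend to hold, or to reject, expectations for every group.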


TABLE 6

INTERVIEWERS’ BELIEFS AS TO VOTING BEHAVIOR OF VARIOUS GROUPS
IN POPULATION*


                                                        Percentage of
Believing That                                          Interviewers
Rich people will vote predominantly Republican .........     76
Factory workers will vote predominantly Democratic .....     55
Farmers will vote predominantly Republican .............     55
Poor people will vote predominantly Democratic .........     58
                                                          N = 33

Belief That Following Groups Will Not Vote as           Percentage of
Bloc or Don't Know How Groups Will Vote:                Interviewers
[illegible] ............................................     21
[illegible] ............................................     27
[illegible] ............................................     24
[illegible] ............................................     27
                                                          N = 33


* These data were made available through the courtesy of the 1948 Political Study of Elmira.


In the discussion of the case material on expectational processes, it 
was noted that even among the small number of interviewers studied 
there was a variation in the proneness to such tendencies. Certain con- 
jectures were advanced, based on the material and on a larger body of 
theory as to the types of interviewers who would be prone to such 
processes. The mail questionnaire affords some more reliable evidence 
on personality factors correlated with such expectational processes. 
Certain questions were asked which might be used as diagnostic indi- 
cators of stereotypic traits. 

Four measures from the F-Scale of the Berkeley Study of
Authoritarianism which had been found empirically to correlate with
stereotypy were asked of the interviewers. These asked the interviewer
whether he agreed with statements on the inevitability of war, the
desirability of a strict leader, the desirability of severe punishment for
sex criminals, and the strict rejection of pre-marital sex relations. The
answers to these questions were pooled into an index, those disagreeing
with three or more of the items being classified as “non-stereotypic.”
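
The pooling rule just described can be written out as a small sketch. The item labels and the function name are hypothetical conveniences, not the original item wordings; the label "stereotypic" for everyone else is taken from Table 7; and the treatment of unanswered items, which the text does not specify, is left out.

```python
# The four item topics named in the text; these keys are illustrative
# labels only, not the original F-Scale wordings.
ITEMS = (
    "inevitability_of_war",
    "strict_leader",
    "severe_punishment_sex_criminals",
    "reject_premarital_sex",
)


def classify(agrees):
    """Classify one interviewer from a dict mapping item -> True if he agrees.

    Disagreeing with three or more of the four items gives
    "non-stereotypic"; anything else gives "stereotypic".
    """
    disagreements = sum(1 for item in ITEMS if not agrees[item])
    return "non-stereotypic" if disagreements >= 3 else "stereotypic"


# Hypothetical interviewer who disagrees with three of the four statements.
example = {
    "inevitability_of_war": False,
    "strict_leader": False,
    "severe_punishment_sex_criminals": True,
    "reject_premarital_sex": False,
}
print(classify(example))  # non-stereotypic
```

An interviewer who disagrees with only two items, or with none, falls in the "stereotypic" group under this rule.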
Cross-tabulation of this index against the questions designed to measure
expectational effects provides some evidence. The data are presented
below.


TABLE 7

THE RELATION OF STEREOTYPIC PERSONALITY TO EXPECTATIONAL
PROCESSES IN THE INTERVIEW


                        Can Predict the       Respondents’ Answers
                        Answers Half the      Generally Split
                        Time or More          Along Class Lines
Stereotypic .......          44%                   44%            N = 63
Non-stereotypic ...          30                    37             N = 88
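
How much weight the differences in Table 7 will bear can be gauged with a two-proportion z-test. This is a sketch under assumptions, not a procedure taken from the source: it treats the published rounded percentages as exact proportions, assumes the two groups are independent samples, and uses a hypothetical function name.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Return (z, two-sided p) for the difference between two proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_err
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Table 7, first column: "can predict the answers half the time or more".
z_predict, p_predict = two_proportion_z(0.44, 63, 0.30, 88)
# Table 7, second column: "answers generally split along class lines".
z_class, p_class = two_proportion_z(0.44, 63, 0.37, 88)

print(f"predict answers: z = {z_predict:.2f}, p = {p_predict:.3f}")
print(f"class lines:     z = {z_class:.2f}, p = {p_class:.3f}")
```

Under these assumptions neither difference reaches the conventional 5 per cent level on a two-sided test, although the first comes close, which fits the cautious statement that the cross-tabulation "provides some evidence."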


Variations in [...] of respondents as a function of the personality of the
interviewer.—The case material was suggestive of the fact that certain
kinds [...] labeled intrusive, are likely to increase the
sensitivity of the respondent to the social aspects of the situation. More
[...] evidence in support of this suggestion is available from
cross-tabulation of replies to the mail questionnaire. Certain measures
[...] to reveal the social orientation of the interviewer and
[...] be tabulated against the measure of the frequency with which
interviewers reported that respondents were keenly oriented to them.
These data are presented below:


TABLE 8

THE RELATION OF MEASURES OF INTERVIEWER INTRUSIVENESS TO RESPONDENT
BEING SOCIALLY ORIENTED TO THE INTERVIEWER

                                            Per Cent Who Report That
                                            Respondents Are Very
                                            Interested in Them
                                            Personally                 N
Among interviewers who:
  Very often feel like staying and
    chatting ............................        28%            [illegible]
  Only occasionally feel like staying
    and chatting ........................        10             [illegible]

Among interviewers who:
  Feel some responsi[...] ...............        24%                   67
  Don't feel res[...] ...................        13                    83

Among interviewers who:
  [row label illegible] .................        30%                   30
  Don't feel [...] ......................        14                   120


Variations in roles assumed by interviewers as a function of
cognitions. Some evidence that interviewers differ in their views of
their proper function in the interview is available from the mail
questionnaire administered to the current NORC staff. The open question
referred to in the foregoing, on their chief failings as an interviewer,
yields some evidence on the degree to which they regard probing as
desirable or important. While the answers cover a wide range of
behaviors, it is interesting that the two most frequent failings
reported referred to contrasted functions within the interview, “not
probing well or enough” vs. “general carelessness or difficulties in
writing.” Those who referred to each of these areas to the exclusion of
the other numbered 21 per cent and 23 per cent respectively.

A specific question was also asked as to the preference for handling 
surveys that contained mainly open questions requiring probing, rather 
than surveys containing mainly pre-coded questions. The split is almost 
even, with 55 per cent preferring the pre-coded type of survey. 

That this latter variation in orientation to the job is partly a function 
of beliefs about the nature of attitude can be inferred from the reasons 
interviewers gave for their preferences for pre-coded questions vs. free- 
answer questions which involve probing. No matter what the prefer- 
ence, the predominant reason given reflected some belief as to the 
nature of attitudes. Thus, among those interviewers who preferred
pre-coded questions, 25 per cent gave as their reason, “respondents
aren't articulate enough, don't make answers consistent, can't back up
their opinions.” Among those who preferred free-answer questions, 33 per
cent claimed that “this comes closer to what people really think and it
gets at people's real feelings” and an additional 18 per cent gave the
clearly related reason, “the respondent feels freer and gets a better
chance to express himself.” These figures give a conservative indication
of the cognitive basis for preference for a given interviewing role, since 
some of the other categories of reasons did contain answers bordering 
on beliefs about the nature of attitudes. However, since these categories 
were less clear, they have not been lumped with the above. 


3. THE VALUE OF A PHENOMENOLOGY OF THE INTERVIEW 


A Framework for the Evaluation of Quantitative 
Data on Interviewer Effects 


Let us imagine what this study would be like if Chapter II had not
been written. In Chapter III, devoted to sources of effect within the
interviewer, we shall see that the most strenuous experimental study
failed to reveal any “ideological bias” in the sense of systematic
distortions of respondent attitudes in the direction of interviewer
opinions, operating uniformly over all classes of situations. In
Chapter VI, on [...] in usual survey operations, we shall see that the
[...] field experiments revealed negligible differences in [...] by
different interviewers on a variety of questions. Confronted with such
findings, one might have rejected the evidence [...] flaws or evaluated
it as “unusual” or “atypical,” since the evidence seems so contrary to
past research and to our traditional view of interviewer effect. Any
research project is bound to be limited in size, and the reader can
always reserve his judgment [...] that another experiment will reverse
the verdict. But the combination of these necessarily limited
quantitative studies with the qualitative materials on the nature of
the interview situation should give readers confidence in accepting
these findings and, in addition, make understandable what might
otherwise appear an unexplainable finding. Here is one obvious function
of Chapter II. We begin to understand the experimental findings that
will be reported and evaluate them properly.

Depending on the plausibility of major experimental findings in
relation to our view of interviewer effects, we might, as just
indicated, have accepted or rejected the findings. But buried under
these main findings (the general unimportance of ideological bias) was
the possibility of specialized interviewer effects occurring under
certain conditions. But under what conditions? Here the qualitative
materials give [...]. They hint at the special circumstances that
hinder or facilitate the operation of biasing tendencies. And, in some
instances, the direction in which they lead analysis is exactly
contrary to the path we might have taken. Thus, for example, if we
[...] differentially great in [...] respondents [...] for such
individuals would have [...] conviction and would [...] be more
suggestible. But the qualitative materials show that apathy is one of
the very safeguards against the interviewer's opinion being [...], and
that ideological bias may occur essentially as a task aid when the
situation involves difficulty in performing given assigned [...]. The
apathetic do not create such difficulties, since their opinions and
lack of opinions are unequivocal. Such materials led to a more refined
hypothesis, which, when tested, yielded evidence of a curvilinear
relation [...] apathy; that is, both apathetic and highly involved
respondents seemed to be less affected than the “somewhat involved”
group.

Another example of the development of more sophisticated models of the
operation of ideological effects is presented in Chapter III, where we
sought the differential occurrence of ideological effects among
interviewers who anticipated difficulties in handling certain
questions, a lead which came from the discussion of situational factors
and the disruption of roles.

Similarly, Chapter V, on the influence of situational factors in inter- 
viewer effect, grows out of the evidence that the interviewer is usually 
predisposed not to bias the data, and that a variety of pressures disrupt
the normal pattern and invoke the biasing tendencies. Chapters III and 
V now incorporate a series of experiments into the influence of such 
factors. 

But these chapters by no means exhaust the respective areas of re- 
search into expectational and situational factors in interviewer effect. 
Nor does this total manuscript exhaust the problem. Further tests are 
called for. With respect to such future research, a host of new hypothe- 
ses can be generated from the qualitative materials. 

Finally, apart from the relevance of these qualitative materials for 
research into interviewer effect, there is a relevance of the findings to
the general operations of public opinion agencies. We now acknowl- 
edge that attitudes are not independent of the circumstances within 
which they are liberated. We shall be better able to interpret the mean- 
ings of our voluminous findings on American public opinion in the light 
of knowing a little better what the situation is like in which respondents 
voice these sentiments. We generally have little but the recorded words 
from which to draw our inferences. The case materials in Chapter II
give us some feel for the relation of respondents toward the social 
world about which they are so continually questioned, and toward the 
interview situation in which they voice their sentiments. 


CHAPTER III 


Sources of Effect Deriving from the Interviewer 


1. THE NATURE OF EXPECTATIONAL PROCESSES 


The phenomenological data in the previous chapter showed clearly 
that interviewers frequently have certain beliefs about their respondents 
which produce expectations as to the answers that should be elicited 
to the questions in the survey. While the existence of what we have 
called role expectations, attitude-structure expectations, and probability 
expectations was supported by considerable qualitative material, only 
suggestive evidence was presented that such expectations actually affect
the behavior of interviewers in such a manner as to alter survey results.
Moreover, no evidence was presented that any alterations in the results
deriving from such expectations would lead to less validity in
measurement. The possibility might be entertained that the interviewers'
expectations have a foundation in truth and consequently enhance validity.

Therefore, it now remains for us to present convincing experimental
evidence on actual expectational effects and their contribution to error.

In so doing, we should not be too hard on the interviewer or make him
bear exclusive responsibility for such behavior. Role and
attitude-structure expectations among interviewers may merely reflect
larger scientific emphasis upon determinism, since these expectations
build upon a concept of regularity in behavior. Kluckhohn brings this
interpretation to our attention in the course of a discussion of
life-history materials in anthropology. He suggests that factors of an
accidental or idiosyncratic nature are usually neglected in [...]
dynamics and sees this as part of a larger tendency of Western science
to abhor “chance.” He remarks:


[...] crucial in determining whether one’s life proceeds [...] various
possible courses.


Kluckhohn then emphasizes that the belief in regularities can blind one
to the significance of such accidental factors and uses words almost
identical with our description of role-expectation effects:


The analyst who wants to really comprehend the total personality of the
informant or revelant must “get behind” the various masks, temporarily
stripping off (but not forgetting) the layer which is the totality of
responses expected of the subject (for example, as old man . . . as
grandfather, etc.).


In addition, such expectations, since they are expressive of tendencies 
to organization of perception, are fundamental psychological processes. 
Since they often involve the ordering of people by certain categories, 
they are in the very nature of society. Much evidence in support of this 
view has already been presented in Chapter II; Oldfield in comment- 
ing on the expectations he observed in his interviewers similarly stresses 
this larger context. He remarks: 


Lastly, we have to consider briefly certain special aspects of the construc- 
tion of the homunculus (representation—image of candidate). It would, I 
think, be incorrect to suppose that this process occurs of itself ab initio. We 
all possess certain generalized frames of reference in regard to which other 
people are assessed, and it is fairly plain that to a greater or less extent these 
are involved not only in making judgments about the completed homunculus 
but also in its construction. That is to say, there exist for each individual 
ready-made skeletons upon which the homunculi are built, and into which 
the impressions of their human counterparts are fitted. This process
represents our tendency to assimilate people to types. It has the
advantage of reducing the time required for the building of the
homunculus. But if the number of such standard skeletons is severely
limited, this also possesses certain obvious disadvantages.


Prior to the presentation of the evidence, however, it is important to
clarify a theory of such effects. Such theory will guide us in
interpreting our experimental findings and will provide more
comprehensive understanding of the total problem than our necessarily
limited quantitative evidence.

That expectations of some order, no matter what their specific content,
do exist among interviewers seems unquestionable. That their
biasing effects on the data would be unconstrained is questionable. 

In survey research, the specific interviewing procedures prescribed 
for the interviewer tend to check the arbitrary exercise of his expecta- 
tions. For example, the “rules of the game” require mechanical record- 
ing or coding of what has been said and the exact adherence to question 
order and wording. For example, the rule to record the respondent’s 
words verbatim and to code a reply in the answer box that most nearly 
corresponds to the actual words reduces the biases arising even when 
the interviewer holds contrary expectations. 

That such legislation over the interviewer is not merely on the books,
but actually exercises some control, is clear from the material
presented in Chapter II, where it was shown that an interviewer may
strongly sense the conflict between his expectations and what the agency
requires of him. However, it is also clear that such rules would not
preclude the operation of expectations. Reference to the Chapter II
materials again reveals that under conditions of stress, or difficulty
in the interview situation, the rules may be consciously flouted.
Moreover, only brief thought is needed to realize that the interview
situation is not that rigid. There are various choices left to the
interviewer. He can continue to probe, or he can accept the answer
already given. He can ask the next question, or he may assume that he
already knows the answer and that the question is therefore redundant.
In addition, the interviewer must apply his judgment in coding an
equivocal answer into one of a limited number of prepared answer boxes,
and even the most rigid rule to record answers “verbatim” allows the
interviewer to omit irrelevancies without defining what an irrelevancy
is. At all these points of choice, the interviewer may well let his
expectations be his guide.
The interview situation might be characterized then as one with some
control over the interviewer's expectations. Within these controls,
however, there is still some realm of freedom, and the controls may be
ignored under particular conditions of stress.
Thus, we would anticipate that expectation effects would be moderate in
magnitude over the general run of data, but might reach extreme
magnitude in the particular instances where both situational difficulty
and freedom of choice were great.

One additional complexity in the operation of such expectations [...]
ought to be considered. Whether the basic expectation is an
attitude-structure expectation predicated upon the early answers or a
role expectation predicated upon an initial judgment of the respondent's
group membership, it might actually be contradicted by evidence in the
course of the rest of the interview. Humans are not so simple and
consistent! Such contradictions might shatter an original expectation.
Conceivably the interviewer might then abandon all such tendencies and
treat each response segmentally. While this is not beyond possibility,
what appears to be much more likely is that such contradictions [...]
would produce some reorganization of [...] or an alternative expectation
which would then govern the subsequent behavior. This at least would
attenuate [...] over a large battery of questions spread throughout an
interview, although it would not reduce the total occurrence of errors
arising from expectational processes per se. The tendency for
reorganization rather than complete fragmentation of all expectations
would seem supported by the extensive literature previously cited on the
primacy of organization in perception. Incidentally, such processes, it
will be seen, make it
difficult for us to measure the full extent of expectational effects by
quantitative laboratory experiments, since a particular instance of bias- 
ing behavior on the part of the interviewer may not correspond with 
a basic expectation that we have experimentally created or measured, 
and would therefore be regarded as negative evidence. Yet, this be- 
havior may well represent an error related to a more subtle or idiosyn- 
cratic expectation, emerging in the course of the experiment, which we 
are not aware of. Consequently, much experimental data will give a 
conservative picture of the total biasing consequences of expectational 
processes, and it would only be through extensive phenomenological 
data that one could evaluate the full effects of expectations. That such 
perceptual reorganizations occur in the course of interviewing, each 
one in turn producing expectational effects on the data, is clear from 
the findings of a study where the total interview process was brought 
under observation through covert electrical recording of the interview. 
The study will be reported in detail in Chapter V. From the examina- 
tion of the transcription and the returned schedule, it was possible to 
score the occurrence of “biasing” errors on questions of prejudice to- 
ward Negroes and Jews. These were errors which led to a spurious 
measurement of the respondent’s real attitude, through distorting the 
direction of the attitude toward the more or less favorable end of the 
dimension. The analysts noted that, while such errors did occur, the 
direction of the effect was not consistent over the series of related 
questions. After examining the recording for the interplay between in- 
terviewer and respondent, they remark: “As far as direction of biasing 
behavior was concerned, the interviewer very often took his cue from 
the respondent, and then in turn exerted some influence upon the re- 
spondent, in a sort of spiralling process.” 

They also remark: “We were not able to develop a measure of bias
based on the material in the recorded interviews which clearly revealed
the operation of any of the interviewer's own prejudices.”

In other words, the interviewer exerted some biasing effect on the 
measurement of prejudiced attitudes, but this did not stem from his own 
ideology nor from a rigid initial expectation. The behavior seemed 
clearly governed by an attitude-structure expectation, but one which 
emerged and developed in relation to the sentiments progressively
expressed in the course of the interview.

Such considerations of the reorganization of expectational processes in
relation to the play of experiences upon the interviewer emphasize again
the role of situational determinants of interviewer effects, which will
be treated fully in Chapter V.

While some reorganization of expectations is likely to occur, it is also
quite likely that initial expectations can at times be rigid and
maintained in the face of contradictory experience. While the ratio of
rigid expectational effects to fluid or reorganized expectational
effects cannot be exactly specified, no doubt both phenomena operate in
some degree.

For example, the occurrence of both types of expectational processes 
and, incidentally, their strong influence upon interviewer behavior can 
be noted in Oldfield’s study of the personnel interview. His report
shows vividly the existence of initial expectations:


As to the forms which the first impression may take, my inquiries among
interviewers have indicated, as might have been expected, that these are
varied . . . We may distinguish the following. . . . an immediate
feeling of like or dislike and connected with this, a tendency for the
formation of spontaneous judgments of a quasi-ethical character
regarding the candidate’s personality. . . . Judgments of a predictive
character relating to the candidate's future either in general or in a
restricted sphere. Such judgments are of the form “he will never get on
in the world,” or “she will make a good shorthand typist.” . . . Lastly,
but from the standpoint of the conduct of the interview perhaps of the
greatest importance, is a sense of knowing how to deal with the
candidate, of perceiving the proper attitude to adopt towards him.


But later on, he implies that such expectations also emerge in the
course of interviewing and may go through reorganization:


Another important feature of the conscious processes is the tendency for
more or less clearly formulated judgments about the candidate to emerge.
Every now and then the process of observation is broken into, and a
judgment is either deliberately made or involuntarily enters
consciousness. The emergence of these judgments often appears to arise
from the [...] attitude toward the candidate. What has been vaguely felt
about the candidate may become more or less explicitly formulated. Now,
it is, I believe, the constant play of such attitudes which are
intrinsically judgmental in character, that determines the interviewer's
conduct of the conversation; and it is in this sense that observation
and a growing apprehension of the candidate regulate the steps the
interviewer takes.

However, the problem is not so indeterminate as it seems. The relative
strength of rigid initial expectations vs. “fluid” expectations can be
specified to some extent, as well as the determinants of these
strengths.

The overriding influence of the initial expectation cannot be denied.
The evidence provides ample support for this view. The phenomenological
data of Chapter II suggest how compelling in character initial
expectations are. Asch’s study, previously cited, shows the influence of
an initial impression in organizing subsequent fragmentary information
about a person. A study by Kelley confirms Asch's basic finding. Here
the conditions had greater similitude to the real-life interview
situation, since the findings were obtained for subjects observing a
real person rather than for subjects reacting to a mere list of
adjectives attributed to a person.

Prior expectations were established by instructions in fifty-five
students that a person who would come to teach them in class had a
certain characteristic. The expectation that the “teacher” was “rather
cold” or “very warm” was randomly applied among the students, who were
required to write a free essay-type characterization after they had
observed the “teacher” and to rate him on a series of traits. It was
found, as with Asch, that the initial trait, in this instance warm vs.
cold, organized and affected the general judgment and reaction to the
other person and even affected the students’ behavior. For example,
students attributed more good qualities to the teacher when the prior
expectation of “warm” was provided.

Another extension of Asch's basic work, but one with almost direct 
relevance to role expectations, was conducted by Haire and Grunes. 
The basic finding shows the strength of an initial expectation in the 
face of contradictory information. A list of adjectives containing the 
word “intelligent” was presented to students at the University of Cali- 
fornia. As with Asch, the subjects were asked to describe the individual 
who was characterized by the items listed. What makes the experiment 
peculiarly relevant to role expectation is that the students were
instructed that the individual in question was a “workingman.” The
findings demonstrated that fragmentary items are reacted to in an
organized fashion, in that, as with Asch’s subjects, the students were
able to give a coherent description. More important to the present
discussion was the fact that the initial instructions that this was a
“workingman” operated to prevent the incorporation of the quality of
“intelligence” into the description, since these students had a clear
and well-organized picture of a “worker” into which intelligence did not
fit.

While the detailed findings will be cited later, about 60 per cent of
the students in some manner distorted the characteristic “intelligence”
in their descriptions. An extreme instance of this phenomenon was the
remark of a student that “intelligence was not notable even though it is
stated.”


A much larger literature gives general support to the influence of an
initial expectation upon subsequent behavior. While studies showing the
influence of an initial expectation upon subsequent perception of
another person are few in number, a much larger literature gives support
to the general influences of initial expectations upon subsequent
judgment of various discrete stimuli. The studies are too voluminous to
be cited, but the effect of imputing some authorship of a given type in
altering the meaning and consequent evaluation of a text as in “prestige
suggestion” experiments, and the effect of some initial uni-directional
context in altering later judgments, have all been well established.
One such set of experiments may be cited for their dramatic
demonstration of the way in which initial stimulation somehow
established an expectation which altered subsequent auditory perception.
These are chosen for their parallelism to the experience of conversation
in an interview. Twenty-five years ago, Marbe reported a number of
studies conducted by his assistant, Schorn. In these studies, the
expectation was produced partly by experimental instructions and partly
by the initial direction intrinsic in the material, as in the later
experiments by Asch. In one of these experiments, twenty subjects were
read a list of eight verbs in very quick tempo, and by instructions the
set was established that these would all express movement. The fifth
verb in the sequence however was “sehen” (to see). When the subjects
were asked to reproduce the words, seven did not mention sehen, and an
additional seven substituted “gehen.” In another experiment of parallel
design, the twenty subjects were instructed that the words would be
expressive of [...] or fear. The word that was out of context was
“beten” (to pray). It was omitted by seven of the twenty subjects, and
five others substituted “beben” (to shiver). In a third parallel
experiment, the set was established that the words would all relate to a
mental process. The word “senken” (to sink) was out of context and was
omitted by half of the subjects. An additional five subjects substituted
“denken” (to think). In a final experiment, [...] political text over a
loud-speaker [...] told that the text was taken from a “Socialist”
newspaper. In reproducing the text, three of the subjects [...]
“[...] Monarchie” (We permit the monarchy), [...]. [...] were reproduced
which had not [...] original text but which were harmonious with the
pattern of Social D[...].
While none of these studies approximate the flux of experiences over the
[...] duration of a live interview with consequent [...] for
reorganization of perception, they do show that [...] of experience is
altered by the initial expectation. They all give support to the
hypothesis that subsequent experiences, even if contradictory, will be
assimilated into the framework of the initial expectation. In the place
of the experimentally created expectations, we merely substitute the
natural ones in the minds of our interviewers.
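
The counts reported for Schorn's three word-list experiments can be tallied in a short sketch. The dictionary keys and the function name are illustrative assumptions. The figure of ten omissions in the third experiment assumes that "half of the subjects" refers to twenty subjects, as in the parallel experiments.

```python
# Counts as reported in the text; each experiment is taken to have
# twenty subjects (stated for the first two, assumed for the third).
SUBJECTS = 20

experiments = [
    {"word": "sehen", "omitted": 7, "substituted": 7, "replacement": "gehen"},
    {"word": "beten", "omitted": 7, "substituted": 5, "replacement": "beben"},
    {"word": "senken", "omitted": 10, "substituted": 5, "replacement": "denken"},
]


def distorted_percent(exp, n=SUBJECTS):
    """Per cent of subjects who omitted or replaced the out-of-context word."""
    return (exp["omitted"] + exp["substituted"]) * 100 // n


for exp in experiments:
    print(exp["word"], distorted_percent(exp))
# sehen 70
# beten 60
# senken 75
```

On these figures, a majority of subjects in each experiment either dropped the out-of-context word or assimilated it to the expected set.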

Such experimental findings on the potency of the initial expectations 
take on plausibility when one notes the variety of dynamic processes 
which the interviewer has at his disposal in resolving apparent contra- 
dictions. Some of these were revealed in the phenomenological accounts 
presented earlier. For example, Interviewer “B” was aware of the
contradictions in the reports of the simulated respondent but
rationalized the contradiction as being not the genuine attitude of the
respondent. Haire and Grunes, in a refined analysis of their data,
report a number of dynamisms by which the initial organization is
protected from the contradiction. Thus five out of the total forty-three
subjects had no difficulty in denying the reality of the trait
“intelligent” in the workingman. For example, one subject remarked “he
is intelligent but not too much so since he works in a factory.” A much
more frequent defense involved the incorporation of the item
“intelligent” with a weakening of its significance by the process of
encapsulating it in the description in such manner that its full meaning
was distorted.

We may well consider certain other features of such expectational 
processes which would reduce the biasing influence of expectations 
early in the interview. While early expectations would have consider- 
able effect on subsequent data in the interview, and emerging or re- 
organized expectations would bias the end portions of the interview 
data, we should expect some degree of specificity in the expectations, 
which would attenuate any global effects on the entire interview. While 
interviewers generally would expect a certain structure of congruent 
attitudes or a pattern of attitudes correlative with some group
membership, it is unlikely that they would predict on this basis the
answer to every one of the questions. While Ichheiser comments on the
“tendency to overestimate the unity of personality,” we may conjecture
that most humans do not see others as operating with a Weltanschauung,
a totally unified body of sentiments. While system and order would be
expected, it would probably be of the nature of several subsystems of 
attitudes, each expected to be orderly but separate. Similarly, interview- 
ers might expect a man to have a certain series of attitudes which dif- 
fered from a woman’s attitudes, but they would probably not regard 
such role determination as encompassing every realm of attitude. 

Therefore, an initial expectation would generally bias the
interviewer’s behavior with respect to three or four subsequent
questions which he believed to be relevant or related to the initially
expected structure and not bias the rest of the questions.

The experiment by Kelley, cited earlier, illustrates this specificity.
Detailed data show that the prior expectation of a “warm” or “cold”
person did not affect the ratings of all the characteristics of the
teacher. The effects were differential depending on the degree to which
the warm-cold variable was regarded as relevant to the characteristics
rated. Kelley is suggesting that the forces deriving from an initial
expectation are constrained in their effects on subsequent data by a
kind of “logic of relevance.”

Other detailed findings by Kelley suggest that prepared role
expectations or probability expectations prior to the onset of the
interview would be attenuated to some extent in given interviews by the
evidence that a particular respondent does not fit the prepared
categories. Presumably this finding would not bear upon the influence of
attitude-structure expectations which, by definition, emerge only
following contact with the given respondent. Several different
accomplices were used as the “teacher” who appeared before the classes.
The influence of the expectation “warm-cold” was not uniform in
magnitude for all such teachers. Kelley is again suggesting some
limitations upon the effect of certain early expectations upon
subsequent interview data.

Just as tentative expectations prior to the onset of the interview might
be dissipated with certain respondents who do not fit the mold, so, too,
[...] given respondents [...] an expectation because of their
characteristics. [...] A respondent might either appear to typify a
certain role and thus accentuate role expectations, or might be regarded
as having [...] organized attitudes, and thus accentuate the operation of
attitude-structure expectations. A suggestive demonstration of this latter
[...] is available in the study by Frenkel-Brunswik, cited in
[...]. As previously indicated, three judges, following prolonged
observation, rated groups of boys and girls on the strength of nine
drives, e.g., drive for autonomy (a striving for independence and
freedom), drive for aggression, etc. It was noted [...] analyzed the
agreement between [...] on the specific drives. What concerns [...]
analysis Brunswik made of the tendency of the judges [...] of drives
co-existing in the children. By intercorrelating the ratings, she could
[...] whether judges regarded children who had a strong need for
autonomy as [...] little need for “social ties.” While these
intercorrelations, of course, [...] by the fact that there are truly
interrelated motivational processes, it will be seen shortly that the
single ratings and the relations between ratings reflect the biases of
the judges. Consequently, the intercorrelations implicitly bear upon the
problem of attitude-structure expectations, since they establish what
contents are regarded by the judge as forming a common structure.

Brunswik noted the rather interesting finding among all the judges
that their ratings of the drives were more highly intercorrelated for
female subjects than for the male. While it is not impossible
that the organization of drives is less specific in women, there seems to
be no real evidence in support of this. It seems more likely that the
judges were simply inclined to the belief that the structure of [...]
in women is more comprehensively organized. For Frenkel-Brunswik’s
judges, who incidentally were women, the old saying that “women are
fickle” may not be accepted. By extension, it is suggested that inter-
viewers might be more prone to exercise an attitude-structure expecta-
tion when interviewing one type of respondent rather than another, on
the basis of strong beliefs as to the relative consistency or unity of given
kinds of people.
Such phenomena as the facts that expectations will generally not sub-
sume all the possible contents covered by the total questionnaire and
that prior expectations will not be applied routinely to all the respond-
ents tend to reduce the massiveness of the bias produced. The bias
would be maximal only for those interviewers whose expectations tend
to be comprehensive in scope and rigid or persistent in the face of the
contradictory appearance and remarks of respondents. That there are
variations among interviewers in these respects is supported by the qual-
itative data presented in Chapter II and the statistical data therein pre-
sented showing the distribution of expectations among the current
NORC field staff. We are not concerned here with the problem of the
determinants of such individual differences or their relevance to the
control of error through selection methods. These matters will be dealt
with elsewhere. What is clear is that there is some reduction of the se-
rious biasing effects, since not all our interviewers have extreme tenden-
cies. Some minority of them even seem free of expectations about their
respondents. Others seem to show strong expectations, but among
these, the expectations may not be comprehensive in scope. However,
that there would remain some small number of individuals who would
have beliefs calculated to produce expectancies over a wide range of
characteristics is suggested by another finding of Frenkel-Brunswik’s.
She intercorrelated the nine sets of drive-ratings assigned the subjects
for each of her three judges separately. Apart from any question of var-
iation in the relationship between a particular pair of drives, she noted
that the judges varied strikingly in the formal tendency to regard any
possible pairs among the nine drives as falling into the same clusters.
Thus, out of seventy-two opportunities to find pairs of drives exhibit-
ing a common pattern, Judge “H” found twenty-five such instances,
whereas Judge “F” found only seventeen and Judge “G” only twelve.
In other words, judges or raters or interviewers seem to vary in the
mere tendency to expect narrow or comprehensively organized struc-
tures, and with some, there is a considerable approximation toward a be-
lief in a simple unitary structure.
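
The three counts just cited can be put on a common footing by expressing each as a share of the seventy-two opportunities. The short sketch below does only that arithmetic; the variable and function names are illustrative and are not taken from the original study.

```python
# Counts reported in the text: instances, out of 72 opportunities, in which
# a judge's ratings showed a pair of drives exhibiting a common pattern.
OPPORTUNITIES = 72
clustered_pairs = {"H": 25, "F": 17, "G": 12}


def clustering_rate(count, total=OPPORTUNITIES):
    """Share of the available opportunities in which a common pattern was found."""
    return count / total


rates = {judge: clustering_rate(n) for judge, n in clustered_pairs.items()}

# List the judges from the most to the least inclined to see drives as clustered.
for judge, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"Judge {judge}: {clustered_pairs[judge]}/{OPPORTUNITIES} = {rate:.1%}")

# Ratio of the most to the least inclusive judge (hypothetical summary measure).
ratio_h_to_g = clustered_pairs["H"] / clustered_pairs["G"]
```

On these figures Judge “H” found a common pattern roughly twice as often as Judge “G,” which is the contrast the paragraph above relies on.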
One demonstration of such a belief in the unity of a subject’s behav-
ior, and in this instance its pervasiveness, is available in a study by
Elkin. A life-history document was circulated to thirty-nine judges,
who were asked to make certain interpretations of the case. The judges
represented such a diversity of backgrounds as psychiatry, anthropol-
ogy, social work, sociology, and psychology, as well as the laity. Within
the academic disciplines, there was further variety, since the psycholo-
gists included both experimentalists and clinicians, and the sociologists
both theorists and “objective researchers.” While differences of inter-
pretation occurred in practically every area, there was consensus on the
one point that the subject had developed gradually and consistently.
The judges, in other words, did not acknowledge incongruity.
Another consideration of importance with respect to the biasing con-
sequences of such expectations is their contents. An entire staff of inter-
viewers might conceivably entertain expectations, but the specific atti-
tude that was regarded as the accompaniment of lower-class status or
the accompaniment of an initial attitude of atheism or the majority po-
sition in the population might vary from interviewer to interviewer.
By contrast, all interviewers might agree as to the attitudes that accom-
pany a given class position.


The bearing of these respective distributions of the contents of inter-
viewer expectations on survey results (their biasing effects) is difficult
to schematize. Ultimately, one would have to explore such [...]
uni-variate and/or bi-variate characteristics are [...]
expectations of homogeneous or heterogeneous [...].
[...] this question of the distribution of the contents of expectations
is of great importance.

Incidentally, it should be noted that variations in the contents of ex-
pectations among interviewers make it difficult to gauge the biasing
effects of expectations in purely quantitative laboratory experi-
ments. For example, if a given initial expectation is created experimen-
tally and we observe the interviewer’s behavior on a simulated question
or answer, it may appear to us that the attitude recorded is not congru-
ent with the expectation. However, for that interviewer the attitude
elicited might be a legitimate part of the over-all structure. Thus, ability
to obtain an apparently inconsistent answer might logically not deny
our theory, and the finding would be only a pseudo-negative one.

While laboratory experiments of the usual design may be insensitive 
to variations in the contents of expectations among interviewers, nat- 
ural-like field experiments to measure expectational effects are likely 
to be insensitive to universally held expectations. In the field study, the 
usual procedure would be to compare the results for interviewers inter- 
viewing equivalent groups, and to correlate these variations in results 
with some measure of expectational tendencies obtained for each inter-
viewer. It will usually not be possible to measure the effect of a uni-
versally held expectation, because one cannot gauge a change in the
survey result (the dependent variable) except by the standard of an- 
other interviewer’s work. (In the laboratory experiment, since one, by 
definition, has a criterion of what the answer ought to be, one can meas- 
ure change whether it is differential or universal.) Thus, it is likely that 
either type of experiment will understate the total effects of expecta- 
tional processes, the extent of this understatement being a function of 
the relative proportion of expectations with universal or differential 
contents. Such methodological considerations again emphasize the im-
portance of inquiry into the contents of expectations, and their distribu-
tion.
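
The point about field designs can be illustrated with a toy simulation. It rests on assumptions that are not part of the original argument: biases are treated as simple additive shifts on a numeric answer, and `simulate`, its parameters and the chosen sample sizes are hypothetical. Each interviewer's recorded answers carry a shift shared by the whole staff (`universal_bias`) plus a shift peculiar to that interviewer (drawn with spread `differential_sd`). The field comparison sees only the spread between interviewers, whereas a comparison against a known criterion sees the average displacement.

```python
import random
import statistics


def simulate(universal_bias, differential_sd, n_interviewers=50,
             n_respondents=200, true_value=0.0, seed=0):
    """Toy model: recorded answer = truth + shared bias + own bias + noise."""
    rng = random.Random(seed)
    interviewer_means = []
    for _ in range(n_interviewers):
        own_bias = rng.gauss(0.0, differential_sd)  # idiosyncratic expectation
        answers = [
            true_value + universal_bias + own_bias + rng.gauss(0.0, 1.0)
            for _ in range(n_respondents)
        ]
        interviewer_means.append(statistics.fmean(answers))
    # Field design: only differences between interviewers are observable.
    field_spread = statistics.pstdev(interviewer_means)
    # Laboratory design: a criterion (true_value) is known by construction.
    lab_shift = statistics.fmean(interviewer_means) - true_value
    return field_spread, lab_shift


# A purely universal expectation: nearly invisible between interviewers.
universal_only = simulate(universal_bias=1.0, differential_sd=0.0)
# Purely idiosyncratic expectations: visible between interviewers.
differential_only = simulate(universal_bias=0.0, differential_sd=1.0)
```

In the first case the between-interviewer spread stays near the sampling noise although every recorded answer is displaced by the full shared bias; in the second the spread is large, while the average displacement against the criterion is small because the idiosyncratic shifts run in both directions.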

That peculiar idiosyncratic definitions of the contents of given struc- 
tures of behavior occur is beyond doubt. From one item of behavior, 
the most varied expectations or inferences can be drawn as to its mean- 
ing or correlates or what structure accompanies it. In the RAF study 
previously cited, on reliability of assessment of pilots, the two psychia- 
trists prepared introspective reports of their methods. Examination of 
these reports indicated the operation of attitude-structure expectations 
as a guide to the diagnostic process. The writers conclude that the: 

two observers . . . have been guided in making their assessments by cer-
tain combinations of the traits listed, and that they have been so guided with-
out being fully aware of the process. These combinations of traits seem to
have provided the observers with an indicator in selecting what is significant
from a very large number of variable factors. That such indicators form the
basis of the clinical method of diagnosis is evident in the definition of syn-
dromes in terms of objective phenomena.
The detailed analysis of the intercorrelations between single traits at-
tributed to the pilots by each of the two psychiatrists shows that there
are differences in the way the traits are combined into constellations or
[...] regarded as forming a common structure. The two psy-
chiatrists, [...] with equivalent samples, obtained different degrees
[...] for various combinations of traits. For example, apart
from [...] that they differed in the frequency with which they
[...] phobias, they differed in the correlative symptoms,
as shown below in Table 9, which is constructed from data
presented in the original report.


TABLE 9

DIFFERENCES BETWEEN INTERVIEWERS IN THE CONTENTS OF AN ATTITUDE-
STRUCTURE EXPECTATION AS REVEALED BY THE INTERRELATIONS OBTAINED
FOR PSYCHIATRIC SYMPTOMS

                                         AMONG PILOTS UNDER TRAINING
                                         DIAGNOSED AS HAVING PHOBIAS,
                                         PROPORTION SHOWING GIVEN
                                         OTHER SYMPTOMS FOR*

SYMPTOM                                  Psychiatrist 1   Psychiatrist 2

[...]                                         14%              54%
[...] obsessional tendencies                   6               31
Obsessional personality                        2                2
Anxiety and obsessional temperament            5                2

* The bases for the percentages were 66 for Psychiatrist 1 and 122 for Psychiatrist 2.


In the study by Frenkel-Brunswik already alluded to, a series of find-
ings increase our knowledge of individual differences in the contents of
[...]. [...], she intercorre-
lated [...] the nine drives,
[...] combinations existed.
[...] negative intercorrelations
[...] regarded those two drives as highly related and compatible.
In other words, judges disagreed markedly as to whether a child who
[...] high or low in another respect. For ex-
ample, in fourteen instances the sign of the intercorrelation between
[...] drives was reversed between Judges “F” and “G” out of a total
[...] possible comparisons. This suggests that there are
marked individual or interviewer differences in the components that
are regarded as contained within a given structure, or that the meaning
of a given entity, in terms of what larger structure it belongs to, shows
marked interviewer variation.
Brunswik, by inspecting the differences among judges in the interre-
lationships between drives, also notes that disagreement was located
mainly in certain drives. Thus, there was great variation among the
judges in the degree to which they regarded the drive “autonomy” as
compatible with other drives, but there was marked agreement on the
entities that accompany the presence of “aggression.” Thus, there ap-
pear to be, for certain phenomena, constant or universal attitude-struc-
ture expectations, perhaps legitimate, whereas for other phenomena the
expectations as to what components belong to the structure are not so
clearly defined and may even be idiosyncratic from interviewer to in-
terviewer.
The material in Chapter II suggests that the contents of expectations
would tend to be uniform when they involve highly institutionalized
patterns or regularities, or at least highly institutionalized beliefs. [...]
we cited as relevant to role-expectational processes, the frequency of
belief among interviewers in the 1948 Elmira study that given economic
groups would vote for a certain party. It was noted for each of [...]
economic groups studied that a majority of the interviewers believed
that the group would vote in a certain direction. For the group “rich
people” the value was a maximum, with 76 per cent of the staff believ-
ing that rich would vote Republican. This suggests that with respect
to very well-established and prominent phenomena, the expectations
would approximate to uniform contents.
One demonstration of uniformity in the content of expectations in an
institutionalized area is available in the work of the Census Bureau in la-
bor-force measurement. The demonstration, incidentally, reveals the
significance of role expectations in causing error in factual as well as in
opinion surveys. Accumulated experience with the Monthly Report on
the Labor Force up to about 1945 had revealed that these surveys were 
failing to classify a considerable number of people as employed or in the 
labor force who should have been so classified according to definitions 
prescribed in the studies. The magnitude of underenumeration of work- 
ers in the MRLF prior to 1942 was of such order that a change in the 
procedure increased the estimate of employment by about one million, 
this increase coming mainly from people formerly classified as students 
or housewives. Another experiment revealed that about one and one-half 
million people engaged in unpaid farm work, each of whom contributed 
a substantial amount (nineteen or more hours) of work per week, had 
been previously recorded in the MRLF as nonworkers. Similar errors
[...] been prevalent in the classification of people in the
[...]. [...] were of such considerable magnitude that it
[...] on the basis of experimental work that approximately one
[...] classified in the decennial census as engaged in
[...] who were actually doing a substantial
amount of unpaid work in agriculture. In discussing these errors, Ducoff
and Hagood remark that one explanation may be that:

there is [...] a possibility that an enumerator will not ask specified ques-
tions because he believes them unnecessary or inapplicable. It is quite
possible that [...] from her housework by an enumerator might auto-
matically be classified as “engaged in own home housework” without being
asked if she [...] at work on a job that week. . . . It seems likely that in
many cases either the enumerator or respondent assumed that the proper
classification for a married woman who kept house was engaged in own
home housework, [...] of whether she was employed full or part time.
Similar mis-classifications of persons who were working and also attending
school undoubtedly occurred.


While the concept is never explicitly employed in these discussions,
it is [...] that a “sex-linked” role expectation was clearly involved as a
[...] error. The magnitude of the effects on the data, as cited in the
foregoing, [...] the inference that role expectations about the non-
working [...] of women must have been rather widely spread through
the field [...]. [...] enumerator interviews a very small proportion of
the total [...]; it therefore seems unquestionable that the expectation
must have been characteristic of a considerable proportion of the enu-
merators in order to bias estimates by a million or more. Again it is sug-
gested that expectancies having to do with highly stable or institutional
features of the society will approximate most to uniformity in content.
However, even in such realms, thorough uniformity is not to be ex-
pected. For example, the data, to be discussed shortly, from our field
experiment on role-expectation effects provide inferential evidence that
interviewers differed markedly in their beliefs as to the patterns of
shopping behavior of men and women. Certainly, no realm could be
much more institutionalized than that of the roles of the sexes in the
economy of the household. Yet, through the idiosyncrasies of the ex-
perience of our interviewers, they even differed in this respect.

A [...] instance of objectively well-defined structures which still
permitted some play for expectations with idiosyncratic contents is avail-
able in an experiment, to be cited shortly, on the biasing effects of atti-
tude-structure expectations. As will be explained, interviewers heard
two simulated interviews, one picturing an “isolationist” respondent,
the other picturing an “interventionist” respondent. Both of these char-
acterizations were vivid, fairly extreme in content, and highly consistent
with the exception of occasional responses. Given the fund of experi-
ence with this well-known typology, and the sharpness of the two il-
lustrations of it, one would expect thorough uniformity in the percep-
tion of the respondents. While this was the finding in general, one notes
that a small number of deviant interviewers were so perverse in their
beliefs that they appraised the isolationist attitude structure as interven-
tionist. The detailed data are presented in Table 10 below.


TABLE 10

VARIATIONS IN INTERVIEWERS’ APPRAISALS OF TWO RESPONDENTS

                                   PERCENTAGE OF INTERVIEWERS

APPRAISED AS                   Isolationist        Interventionist
                               Characterization    Characterization

Strongly interventionist             1                   52
Interventionist                      1                   40
[...]                               11                    8
[...]                               58                    -
[...]                               29                    -
                                   ---                  ---
                                   100                  100
                               (N = 114)            (N = 114)


As previously noted, errors arising from attitude-structure expecta- 
tions or role expectations will affect the values of bi-variate characteris- 
tics—i.e., relations between different characteristics—by inflating or ob- 
scuring the true value. Since much opinion research concerns itself with 
refined cross-tabulations or with problems of an explanatory nature 
rather than with marginals or problems of sheer description, errors aris-
ing from expectational processes assume great significance.
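
How an expectation can inflate a bi-variate relation may be seen in a small simulation. Everything in it is an assumption made for illustration, not a finding of the study: two attitudes are generated as truly independent, 30 per cent of the second answers are taken to be equivocal, and an interviewer who holds an attitude-structure expectation codes an equivocal answer to agree with the first attitude with probability `expectation_strength`.

```python
import random


def recorded_agreement(expectation_strength, n=20000, seed=1):
    """Share of respondents whose two recorded attitudes agree."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(n):
        first = rng.random() < 0.5
        second = rng.random() < 0.5        # truly independent of the first
        equivocal = rng.random() < 0.3     # assumed share of lukewarm answers
        if equivocal and rng.random() < expectation_strength:
            second = first                 # coded to fit the expected structure
        agree += (first == second)
    return agree / n


no_expectation = recorded_agreement(0.0)    # close to the chance level of 0.50
full_expectation = recorded_agreement(1.0)  # close to 0.7 * 0.5 + 0.3 = 0.65
```

Under these assumptions the recorded agreement rises from chance to about 65 per cent although nothing in the respondents has changed, so that a cross-tabulation would show a relation produced entirely by the coding.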

A final theoretical issue with respect to the nature of such expectation 
effects is the proper evaluation of them. We may well demonstrate that 
such expectations exist, and that they affect the answers recorded for 
the respondents. Whether these alterations of the answers reduce the 
accuracy of survey measurements is another and much more fundamen- 
tal question, since there is no assurance that what the respondent says 
in the first place is true. 

The thesis could easily be advanced that such expectations on the part 
of the sensitive interviewer lead him closer to the truth than the mere 
verbal report of the respondent, and that they should be permitted to 
operate freely. An influential body of opinion would argue that an in-
dividual’s attitudes are organized, and that the structure apprehended
might represent the truth rather than the discrete report. Such opinion
might further claim that the respondent engages in self-deception or de-
liberate deception or that he gives a casual answer rather than his con-
viction or that the discrete report only takes on meaning in the light of
its setting with other opinions. This view would regard as perversity
the acceptance of the respondent’s report as valid instead of the report
as interpreted by the sensitive observer.
Even if one were to grant this view, evidence has been presented that
interviewers vary in the tendencies to expectations as such and in the
contents they ascribe to given structures. Consequently, while one or
another interviewer may apprehend the truth, the operation of such ex-
pectations over the entire field staff will reduce the reliability of various
[...]. However, it is our thesis that such expectations blind the given
interviewer to the full complexities and realities of the attitudes he is
[...] to elicit and record, and therefore reduce the validity of
[...]. Empirical data to be presented below will provide some support
for this argument, but logical considerations provide strong support for
the view that the operation of such expectations is not the best means of
increasing validity of survey data.
One might well admit that the answers of respondents in surveys
are [...] invalid, yet urge that measures taken to assess and [...]
validity be introduced on a systematic basis, by checks introduced
[...] or by instituting new modes of questioning, interviewing, and
the like. If the interviewer is left to his own devices to check upon
the validity of the results, there is no way of distinguishing original data
from interpreted data, and checks and corrections might be duplicated.
Given the present assumption of public opinion research, namely, that
the recorded answer is a faithful account of what the respondent said,
rather than an interpretation, the danger of allowing such expectations
to distort the respondent’s remarks lies not alone in the errors perpe-
trated, but in the fact that we do not know which is interpretation and
which is verbal report.


2. EXPERIMENTATION ON EXPECTATION EFFECTS


To test whether or not there actually was an effect of
attitude-structure expectations, a modified form of [...]
was used. By means of phonograph transcriptions, [...]
subjects heard two typical, yet markedly [...]
in a situation as closely resembling an interview as was
consistent with experimental design.
After these respondents had given enough
replies to establish their general sentiments clearly (and thus permit sub-
jects to form attitude-structure expectations), test responses were in-
serted at intervals in the course of the interviews. These test responses
took the form either of lukewarm or equivocal responses that were the 
same in both interviews or of responses that were inconsistent with the 
attitude structure of the respondent. From the way subjects recorded or 
coded the discrete but equivalent responses they heard in the two inter- 
views, it could be determined whether or not the two sets of attitude- 
structure expectations had an effect upon the results. 
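
The comparison described above amounts to asking whether the same test response receives a given code more often in one context than in the other. A minimal sketch follows; the helper `code_shift` is hypothetical, and the two lists of codings are invented solely to show the calculation. They are not data from the experiment.

```python
from collections import Counter


def code_shift(codes_context_a, codes_context_b, code):
    """Difference between contexts in the share of subjects choosing `code`."""
    share_a = Counter(codes_context_a)[code] / len(codes_context_a)
    share_b = Counter(codes_context_b)[code] / len(codes_context_b)
    return share_a - share_b


# Invented codings of one equivocal test response, as recorded by ten
# subjects after each transcription (illustration only, not study data).
isolationist_context = ["too much"] * 6 + ["about right"] * 4
interventionist_context = ["too much"] * 2 + ["about right"] * 8

shift = code_shift(isolationist_context, interventionist_context, "too much")
```

A shift near zero would indicate that the context made no difference to the coding; a sizable shift in the direction of each respondent's general sentiment is what the hypothesis of an expectation effect would lead one to look for.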

The experiment utilized a questionnaire of the type frequently used 
in opinion surveys. The questionnaire contained a majority of pre- 
code-type questions, but also a few free-answer questions. With this 
questionnaire as a guide, two dummy interview scripts were written. 
From these, phonograph transcriptions were made with a professional 
actor and an NORC staff member playing the roles of respondent and 
interviewer respectively. The respondent heard on the first transcrip-
tion was an isolationist, provincial, and prejudiced respondent. The re-
spondent heard on the second transcription was a thoughtful, well-read 
interventionist. These two types were chosen because of the striking 
contrasts which it was possible to portray, because question and an- 
swer material for such characters was readily available, and because the 
types were so familiar to most interviewers, as well as laymen, that they 
would have verisimilitude.

One other reason for the choice of these two types was prominent. 
Limited funds prohibited testing out the types and empirically deter- 
mining for the experimental subjects what specific attitudes did not fit 
with the over-all type and, when necessary, dubbing new material into 
the record. In the absence of such ideal circumstances, types had to be 
chosen for which a “good guess" could be made as to the discrete atti- 
tudes that would be regarded as contributing to or as inconsistent with 
the over-all picture. It was assumed that not too much error would oc- 
cur in identifying our conception of the isolationist or interventionist 
with the interviewers’ conception of these types. In so far as our con-
ception was wrong, the script would not contribute to the over-all pic-
ture intended, and the findings would not be a crucial test of the hy- 
pothesis. More than this, as previously suggested, what was regarded as 
an inconsistent item by us might on occasion have been accepted by the
subject as a legitimate content of the over-all structure of attitudes. In 
such instances, accuracy in recording a so-called inconsistent answer 
would logically not have denied the hypothesis at all, but the finding 
would appear to be negative evidence. The comments of several sub-
jects [...] suggest that their accurate recording of an “inconsist-
ent” answer merely represented the fact that they regarded this an-
swer as consistent with the whole, and in this sense the findings to be
[...] are a conservative test of the hypothesis that recording would
be biased in the direction of the expectation.
[...] of the two respondents might be regarded as
[...], but this was necessary to insure that the interviewers
perceived the character as intended; otherwise negative results would
have been indeterminate. They might have meant either that no biases
arose from such perceptual processes or that the experiment provided
no test [...] no expectations had been established. In order for the ex-
periment to lend itself to an unequivocal interpretation, it was necessary
to [...] pictures presented. While this might accentuate the
magnitude of the biases observed as compared with normal national
cross-sections, which do include some humans so vague in outline [...]
character whatsoever, the reality of these extreme types is
known [...] all in public opinion research. Moreover, as is clear from the
ratings the experimental subjects gave to the respondents, [...]
the intended characterization was even missed on occasion, and
in this sense the over-all results are again conservative.
[...] that the effect of expectations would be especially no-
ticeable, if at all, in the subjects’ handling of lukewarm or equivocal re-
plies. For, on the one hand, it is evident that if a response were [...]
consistent with attitude-structure expectations there could be no [...]
[...], since expectations would [...].
On the other hand, if a response were [...] inconsistent with attitude-
structure expectations, the chances are that the interviewer’s image of
the respondent’s structure would itself have to be revised, and the
expectations [...] with it. But if the response were lukewarm, it might
have no such [...], and expectations might have full charge in guiding
[...]. Therefore, reliance was placed mainly on lukewarm or equivocal
responses in testing the hypothesis, although inconsistent responses were
likewise [...] for this purpose.
The experimental subjects who listened to the transcriptions had in
front of them copies of the questionnaires corresponding to the inter-
view, [...] write down or code the answers as they
[...]. So that errors in recording were not due to the artifact of lack
[...], the intervals between question and [...]
usual [...] of delivery of a respondent. While the
[...] exactly and did lead to a few complaints about being hurried,
the influence of such a factor upon the results can be questioned on the
basis of empirical data, presented in the original article, on the lack of
any relation of clerical errors to expectation effects. The mechanical
quality of the transcriptions was good, so that inaudibility of the an-
swers could scarcely have been significant in accounting for error. Data
collected from the subjects as to difficulties in reception show that these
were negligible.

So that errors could not be due to lack of practice in handling the 
mechanics of interviewing on this survey or to unfamiliarity with the 
rules for handling given questions, the experimental subjects filled out 
one questionnaire ahead of time, recording their own opinions. In addi- 
tion to the practice this task afforded, it provided a measure of the 
subjects’ own ideology, so that the influence of this variable on the 
results could be evaluated jointly with the influence of expectations. At 
the time the subjects recorded their own Opinions, they were given 
written specifications on the purposes of the survey and the procedure 
for handling given kinds of answers, A final briefing period was held at 
the time of the experimental sessions. Just before the transcriptions were 
played, the subjects were given last-minute instructions—a quick review 
of the specifications and particular instructions for the sessions them- 
selves, including a request that they try to imagine that this was an ac-
tual interview. The subjects were assembled in small groups over a
number of different sessions. The order of presentation of the two tran-
scriptions was rotated from session to session so that the influence of
temporal factors of fatigue or practice was equally operative upon the
results of each of the two interviews for all subjects taken together, and 
cannot account for the differences in recording of answers. 
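
The rotation of order from session to session is a simple counterbalancing scheme, and can be sketched as follows. The number of sessions is not stated in the text, so the six sessions used here, like the names `ORDERS` and `assign_orders`, are assumptions for illustration.

```python
from itertools import cycle

# The two possible orders of presentation of the transcriptions.
ORDERS = [
    ("isolationist", "interventionist"),
    ("interventionist", "isolationist"),
]


def assign_orders(n_sessions):
    """Alternate the order of presentation from one session to the next."""
    order_cycle = cycle(ORDERS)
    return [next(order_cycle) for _ in range(n_sessions)]


schedule = assign_orders(6)  # six sessions is an assumed figure

# Count how often each transcription is played first across the sessions.
first_counts = {
    name: sum(1 for order in schedule if order[0] == name)
    for name in ORDERS[0]
}
```

With an even number of sessions each transcription is heard first equally often, so that fatigue or practice bears on the two interviews alike; with an odd number the balance is off by one session.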

After each transcription was played, subjects were given time to fill 
out a so-called “field rating” of the dummy respondent—actually an 
appraisal of relevant characteristics of the respondent, his extent of in-
terventionism or isolationism, his interest in and level of information
about international affairs. This enabled us to determine whether the 
subject had actually perceived the over-all characterization intended. In 
addition, subjects were given a form on which to report their personal 
characteristics and their comments about the experiment—whether they 
were able to hear each response, whether they maintained the same im- 
pressions of the respondents throughout each interview (to determine 
whether some of the deviant test responses had caused a re-formation of
attitude-structure expectations). 

Some 117 subjects participated in the experimental sessions. They in-
cluded regular public opinion poll interviewers from various co-oper-
ating agencies, university graduate and undergraduate students. About
a third had no previous professional interviewing experience, although
they had had related course work in the social sciences. Half had up to
one year of professional interviewing, and the remainder had experience
longer than a year.

The experimental procedure described in the foregoing should have
provided a crucial test of the influence of attitude-structure expectations
upon the results. The hypothesis would seem to be proven if the equivalent
answers inserted into the two transcriptions were coded differently,
depending upon the context within which they were imbedded. However,
such a finding might be open to one other explanation. Conceivably
the different coding of apparently equivalent answers could be due
to uncontrolled factors associated with the way in which the crucial
answers were spoken by the actor respondent. For example, one answer
might have been delivered more emphatically or knowingly than the
other. Furthermore, the answers on both records were not word-for-word
duplicates, although they were the same in substance. The variation
in the results might be attributed to such factors, intrinsic to the
answer, rather than to the expectation process operating upon
psychologically equivalent answers. To investigate this possibility, the
test answers were taken out of context and placed in random order in a
series with other typical answers to the questions. The series was then
presented to a group of judges in both oral (soundscriber recording) and
written form. The judges were asked to code these answers according to
the same specifications that had been given to the experimental subjects.
The results from the judging sessions served to tell what the coding of
the test responses would be if they were presented out of context; they
thus served as a guide against which to compare the results of the
experimental sessions. Those test responses not coded according to the
design by the judges were eliminated from further analysis.

For two of the questions, there was no doubt whatsoever that the
test responses were identical in content. These were Questions 7
and 15E on the questionnaire. Both of these were pre-coded questions,
requiring the interviewer to circle the code on the questionnaire which
seemed most nearly to fit the respondent’s attitude.

Question 7 was phrased as follows: “In general, do you think that
the United States is now spending too much on our program for European
recovery, about the right amount, or not enough?” Codes corresponding
to the alternatives were provided. In response to this question,
the isolationist said: “All I know is that it’s costing us taxpayers an
awful lot of money. But I suppose you got to feed those starving people
and I guess you can’t do it for less. Still a lot of that money is just going
down the drain. Them people ain’t working over there. They don’t
appreciate it.”

In response to the same question the interventionist replied: “Well,
there’s no question but that the economic recovery program is costing
this country a good deal of money. Still, I presume we must help Western
Europe get back on its feet, and I suppose it can’t be done for less.
Nevertheless there has been a certain amount of mismanagement and
waste.”

The judges, in the light of specifications which instructed the
interviewer to ignore any criticisms of the manner in which the money was
being spent, coded both responses as “about right amount.” The experimental
subjects, however, hearing these responses in their contexts, displayed
a strikingly different pattern of recording, as Table 11 indicates.
Hearing the isolationist’s reply, 53 per cent of the subjects coded “too
much,” while 20 per cent coded “about right amount.” On the other
hand, hearing the interventionist’s reply, 9 per cent of the same group
of subjects coded “too much,” and 75 per cent “about right amount.”

It is interesting here to follow the thinking of one of the
interviewer-subjects, who reported his thoughts during a phenomenological
interview. In speaking of the isolationist’s response, this subject said,
“Well, he has given two answers which I would ask him to clarify. In one
case he said ‘Too much,’ and in another case, ‘About right amount’ . . . I
get the feeling that this individual really means ‘Too much,’ but I would
put it with reservations . . . He has said both, but I think I’ll put ‘Too
much’ for this individual.”

The second crucial question mentioned above, 15E, was one of a series
of questions about level of interest in foreign and domestic affairs. It
was phrased as follows: “How much interest do you take in our policy
toward Spain: a good deal of interest, some interest, or practically
none?” To this the isolationist replied, “It’s the way I told you, I don’t
follow the papers much these days, but I guess you could put me down
as taking a little bit of interest in that.” The interventionist responded
with, “Compared with the other areas you’ve mentioned, I guess I’d
regard myself as having only a little bit of interest in that.”

The judges, following specifications, coded both replies as “Some.”
As Table 11 indicates, there were 20 per cent of the subjects who coded
“None” for the isolationist, and only one per cent who coded the
interventionist’s reply this way.

The differences in the coding of the replies to these questions, then,
are attributable to the operation of the two expectation patterns.
Under one condition, that of equivocal or lukewarm responses, the
effect of attitude-structure expectations is to influence survey findings.
The nature of these effects on the results is clearly of two kinds.
First, the marginal distribution on a particular question is distorted.
Second, the intercorrelations between attitudes are affected, since


TABLE 11

THE INFLUENCE OF EXPECTATIONS ON THE CODING OF SUBSTANTIALLY IDENTICAL
RESPONSES TO TWO QUESTIONS
(in per cent)

                                        CLASSIFICATION GIVEN BY SUBJECTS TO:
                                        Isolationist      Interventionist
                                        Respondent        Respondent

Question 7. Amount spent by U.S. on
program for European recovery
  Too much . . . . . . . . . . . . .        53                  9
  About right amount . . . . . . . .        20                 75
  Not enough . . . . . . . . . . . .
  Don't know and other . . . . . . .

Question 15E. Amount of interest in
policy toward Spain
  Some . . . . . . . . . . . . . . .        76                 99
  None . . . . . . . . . . . . . . .        20                  1
  Don't know and other . . . . . . .         4                 --
                                           100                100

Number of cases . . . . . . . . . .        117                117


. . . of the attitude-structure expectations. Thus, estimates predicated
on marginals, and dynamic interpretations based on relations between
attitudes, would both be impaired by these effects.

Some data collected in conjunction with this experiment bear on the
fundamental problem posed by the effect of such expectations on the
validity of survey results. Some of the experimental subjects acted as
interviewers in a survey of community attitudes in Denver in 1949. In the
case of this survey, since checks on the accuracy of the report on a
series of questions were available in the form of official records on
each respondent, it is possible to determine the validity of the results
each interviewer obtained. Since the interviewers received assignments which
were equivalent, any differences in validity can be assigned to the inter- 
viewer. The systematic relation between the validity of the reports ob- 
tained by different interviewers in this survey, and their tendency to 
introduce expectation effects in the experiment, will provide some an- 
swer to the larger issue of the good or bad consequences of such ex- 
pectations. In Table 12 these findings are presented in the form of 


TABLE 12

THE RELATION OF EXPECTATION-EFFECT TENDENCIES TO THE VALIDITY OF
REPORTS OBTAINED IN THE COURSE OF A FIELD SURVEY

                                               Prone to        Not Prone to
                                               Expectation     Expectation
                                               Effects         Effects
                                               (n = 22)        (n = 17)

Report of vote in 1948 presidential election
  Interviewers with
    the least invalidity . . . . . . . . .         8                5
    moderate invalidity  . . . . . . . . .         3                9
    the most invalidity  . . . . . . . . .        11                3

Report of automobile ownership
  Interviewers with
    the least invalidity . . . . . . . . .         7
    moderate invalidity  . . . . . . . . .         7
    the most invalidity  . . . . . . . . .         8

Report of personal contribution to Community Chest
  Interviewers with
    the least invalidity . . . . . . . . .         5                9
    moderate invalidity  . . . . . . . . .         7                6
    the most invalidity  . . . . . . . . .        10                2


frequencies. Proneness to expectation effects was measured by the tendency
to distort the handling of Question 7 in the experiment, and the
relative validity of the interviewer's results was measured by classifying
all interviewers into one of three categories defined by the relative
magnitude of the invalidities obtained.

In three instances, those experimental subjects who were expectation-prone
were more likely to fall into the category of interviewers who
obtained relatively less valid results. The data reveal this fact by
inspection, and Chi-squared tests for the three items reveal P-values of
.02, .85, and .05 respectively. When these values are pooled to get an
aggregate test, the difference is significant at the .05 level. One might
argue that the invalid results derived not so much from expectation
tendencies but from other factors correlated with expectation effects. For
example, from evidence presented in the original report of this study, it
was noted that those interviewers who are prone to expectation effect
differ in
experience and in skill at clerical tasks, although the differences are not
statistically significant. Conceivably, the difference in performance of
the two groups might derive from such uncontrolled factors. While the
number of cases is small, the relationship between expectation effects
and invalidity was examined, controlling first for length of
experience and then for clerical skill. In both refined tests, the
relationship persists, although it is reduced in magnitude. In this case,
at least, the expectation process seems to produce blindness rather than
insight.
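
The chi-squared figures quoted for Table 12 can be checked with a short
calculation. The sketch below is only an illustration: it uses the
frequencies for the two items whose cells are fully legible (the 1948 vote
and the Community Chest contribution), relies on the fact that a table of
three rows and two columns has 2 degrees of freedom, and assumes Fisher's
method for the pooled test, since the text does not say how the three
P-values were pooled. All function names are hypothetical.

```python
import math

# Rows: least, moderate, most invalidity.
# Columns: prone to expectation effects (n = 22), not prone (n = 17).
VOTE = [[8, 5], [3, 9], [11, 3]]
CHEST = [[5, 9], [7, 6], [10, 2]]


def chi_squared(table):
    """Pearson chi-squared statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_2df(stat):
    """Upper-tail probability of chi-squared with exactly 2 degrees of freedom."""
    return math.exp(-stat / 2)


def fisher_pooled_p(p_values):
    """Pool independent P-values by Fisher's method (an assumed choice).

    -2 * sum(ln p) is referred to chi-squared with 2k degrees of freedom;
    for even degrees of freedom the upper tail has a closed form.
    """
    half = -sum(math.log(p) for p in p_values)
    k = len(p_values)
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))


if __name__ == "__main__":
    print(round(p_value_2df(chi_squared(VOTE)), 3))
    print(round(p_value_2df(chi_squared(CHEST)), 3))
    print(round(fisher_pooled_p([0.02, 0.85, 0.05]), 3))
```

Under these assumptions the vote and Community Chest tables give P-values
that round to .02 and .05, and the pooled value falls below .05, in line
with the figures reported.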
A second experiment was devised to determine the biasing effects of
attitude-structure expectations. Like the previous experiment, this one
was limited to the test of the hypothesis that such expectations,
emerging in relation to the expression of early attitudes, can affect results
purely through the classification of answers on pre-coded questions.
However, it goes beyond the first experiment in specifying some of the
conditions under which expectations operate.

Sixty interviewers, members of the current NORC field staff, were
sent a sheet containing twenty-five discrete answers to the following
question: “In general, do you feel the United States is now spending
too much on our program for European recovery, about the right
amount or not enough?”

It should be noted that this question was identical with one of the
experimental questions used in the Smith-Hyman study.
. . . constructed. These consisted of interview schedules containing eleven
questions and fabricated responses to each, of which the experimental
question with each of the eight responses constituted the sixth question on
the ballot. The responses to the nonexperimental questions were designed
to produce in the interviewer's mind a picture of a respondent whose
general attitudes were in presumed conformity with the preceding
categories, that is, respondents whose answer to the experimental
question might be “too much,” “about right,” or “not enough.” In all,
fourteen different contexts were constructed: two each for the six
experimental responses and one each for the control responses. If the split
between interviewers was, let us say, between “too much” and “about
right,” then one each of these contexts was constructed for that
particular response.

The questionnaires were then filled in, containing a fabricated context
plus the appropriate experimental item imbedded in the proper place.

A quota of such simulated ballots was then distributed to each
interviewer after a sufficient lapse of time to reduce memory. He received
the answers in a context opposed to his previous code. Thus, if an
interviewer had coded response No. 6 as “about right” and the main split for
that response was between “about right” and “too much,” he received
the answer in a “too much” context.

Among that group of interviewers who had previously declared an
item “not codeable,” the concept of a context opposing the previous
code in direction is meaningless. Hence, within this group, contexts in
two different directions were alternately applied. All the interviewers
were asked to code the entire set of answers on each of the ballots.

The ostensible nature of the assignment was a routine survey that
NORC had conducted, in which we were trying out interviewers as
coders in place of the normal office staff. To reduce suspicion, different
handwritings had been used, so that no interviewer would receive more
than two ballots with the same writing. Otherwise, given the knowledge
of the small field assignments in the usual survey, an interviewer might
become suspicious.

As contrasted with the earlier experiment, the cues creating the atti- 
tude-structure expectations were purely the written contents, rather 
than the combination of content plus all the vocal skills at the disposal 
of a professional actor trying to create a vivid characterization. In this 
sense, minimal expectations should have been operative. However, the 
experiment was pretested on a group of office coders, and where the 
context we had initially constructed was too weak to produce effects 
the context was revised in the direction of a clearer picture, so as to
strengthen the likelihood that interviewer expectations would emerge.

As in the first experiment, the measure of expectation effects in the
aggregate was that the codes assigned to the experimental items when
they were imbedded in particular contexts shift markedly from the
original codes assigned the items when they were presented discretely.

To measure the differential effects of expectations as related to given
variables, the magnitude of shift in coding will be presented for items
varying in certain respects.

These shifts were evaluated in terms of their direction. Where a shift
occurred from a code involving a definite opinion to the code “don’t
know,” the assumption was made that this shift was a “half-shift,” since
the “don’t know” category was regarded as a halfway point between the
two poles of the attitudinal dimension involved. Similarly, where a shift
occurred from an original “don’t know” code to a definite opinion, this
was regarded as a half-shift, since the distance traversed on the
dimension was only half the distance between poles. The assumption seems
reasonable, since the category “don’t know” was applied exclusively
to a respondent whose attitude was definitely regarded as equivocal.
Where the interviewer himself was equivocal about an apparently
definite opinion, he presumably used the category “not codeable.”

While these assumptions seem reasonable, such half-shifts are separated
in the presentation of the results, so that the reader can evaluate
the findings independent of these possibly indeterminate data or can
make any assumption he wishes about the “don’t know” codes.
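
The scoring rule for full shifts and half-shifts can be written out as a
small routine. This is a sketch under stated assumptions, not the authors'
tabulating procedure: the record layout, the function names, and the
decision to count only movement toward the context are illustrative, and
responses coded "not codeable" are assumed to have been set aside
beforehand.

```python
DONT_KNOW = "don't know"


def shift_score(original, recoded, context):
    """Score one recoding: 1.0 for a full shift, 0.5 for a half-shift, else 0.0.

    Only movement in the direction of the context code is counted.
    """
    if recoded == original:
        return 0.0
    if original == DONT_KNOW:
        # From the halfway point to the pole favored by the context.
        return 0.5 if recoded == context else 0.0
    if recoded == DONT_KNOW:
        # From a definite opinion opposed to the context to the halfway point.
        return 0.5 if original != context else 0.0
    # From one definite opinion to another.
    return 1.0 if recoded == context else 0.0


def percent_shifting(records):
    """Return (full %, half %, total %) for a list of
    (original code, code assigned in context, context direction) records."""
    scores = [shift_score(*record) for record in records]
    n = len(scores)
    full = 100 * scores.count(1.0) / n
    half = 100 * scores.count(0.5) / n
    return full, half, full + half


if __name__ == "__main__":
    example = [
        ("about right", "too much", "too much"),     # full shift
        ("about right", DONT_KNOW, "too much"),      # half-shift
        ("about right", "about right", "too much"),  # no shift
        (DONT_KNOW, "too much", "too much"),         # half-shift
    ]
    print(percent_shifting(example))
```

Keeping the full and half percentages separate, as the routine does, lets a
reader recompute the total under a different assumption about the
"don't know" codes.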

In Table 13 below, the results are presented for each of the eight
items. It is clear that interviewers in large number shifted their
classification in the direction of the presumed context. It is, of course,
possible that such shifting of judgment is to some extent sheer
unreliability, i.e., an interviewer given the task of coding the discrete
item a second time might shift his judgment even in the absence of context.
Unfortunately, such measurements of shifting for the repetition of the
original discrete items were not possible. However, that such shifts were
not due to mere capriciousness is indicated by the results for control
items. On these items, 89 per cent and 100 per cent of the interviewers
coded the item the same as they had previously, despite context.
Incidentally, this finding demonstrates that the effect of expectations
created by context will be minimal for unequivocal responses.

In addition, comparisons of the amount and direction of shifting for
items varying in certain respects indicate that shifting in the
direction of context is correlative with a number of factors. This again
suggests that such shifts are systematic rather than
mere instances of unreliability of coding. For example, it will be noted
from the table that the effect of the expectation is greater when the
original response is ambiguous. Ambiguity was measured by the degree
to which the sixty interviewers disagreed on their original coding of the
discrete item. If, among those interviewers assigning a definite code,
there were an equal number coding the item in two different ways, the
response in question was regarded as maximally ambiguous.


TABLE 13

THE EFFECT OF ATTITUDE-STRUCTURE EXPECTATIONS ON CODING AS REVEALED BY THE
MAGNITUDE OF SHIFTING WHEN THE RESPONSE IS IMBEDDED IN AN EXPERIMENTAL
CONTEXT

                          PERCENTAGE SHOWING SHIFTS IN THE DIRECTION
ORIGINAL SPLIT            OF CONTEXT, EXCLUDING RESPONSE “NON-CODEABLE”
(PER CENT), EXCLUDING
RESPONSES                 Full         Half
“NON-CODEABLE”            Shifts       Shifts       Total

Experimental item
  44-56                    34           16           50
  39-61                    39           16           55
  29-71                    15           29           44
  28-72                    23            4           27
  21-79                    21           32           53
  0-100                     0           22           22
  0-100                     8            3           11
  0-100                     0            0            0


This finding on the relation between ambiguity and shifting supports
the suggestion made in Chapter II and elaborated in Chapter V that
expectational and other biasing processes are often invoked as task aids
when the situation is difficult for the interviewer.
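
The relation between ambiguity and shifting can be read off Table 13
directly. The following sketch copies the eight rows of the table and
compares the average total shift for items on which the interviewers were
originally split with that for items coded unanimously. The ambiguity index
(twice the minority share, so that an even split scores 1.0) is an
assumption made here for illustration; the text defines only the maximal
case.

```python
# (minority %, majority %, full shifts %, half shifts %), copied from Table 13.
TABLE_13 = [
    (44, 56, 34, 16),
    (39, 61, 39, 16),
    (29, 71, 15, 29),
    (28, 72, 23, 4),
    (21, 79, 21, 32),
    (0, 100, 0, 22),
    (0, 100, 8, 3),
    (0, 100, 0, 0),
]


def ambiguity(minority, majority):
    """Hypothetical index: 0.0 for a unanimous item, 1.0 for an even split."""
    return 2 * minority / (minority + majority)


def mean_total_shift(rows):
    """Average of full plus half shifts (per cent) over the given rows."""
    return sum(full + half for _, _, full, half in rows) / len(rows)


split_items = [row for row in TABLE_13 if row[0] > 0]
unanimous_items = [row for row in TABLE_13 if row[0] == 0]

if __name__ == "__main__":
    for minority, majority, full, half in TABLE_13:
        print(f"{ambiguity(minority, majority):.2f}  total shift {full + half}")
    print(mean_total_shift(split_items), mean_total_shift(unanimous_items))
```

On these figures the five split items average about 46 per cent shifting,
against 11 per cent for the three unanimous items.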

That such expectations function to reduce task difficulty in coding is
also clear from the fact that the equivocal answers when given in a context
are more likely to be assigned some definite code. This can be
shown by comparing the proportion of instances for the total of 344
experimental responses given the staff as a whole where the interviewer
classified the item as non-codeable under the two conditions.

In the absence of any context, 34 per cent of all the responses were
classified as not codeable, whereas in the presence of context only 25
per cent of the same responses were classified as not codeable. However,
this 9 per cent reduction in non-codeability for all responses in the
aggregate does not adequately represent the full effects of context. While
some items that had been previously regarded as non-codeable became
codeable in the presence of context, other items that were previously
codeable produce a conflict situation for the interviewer when placed
in a context. In such instances, responses previously classified as
codeable became non-codeable. Such changes implicitly reflect the
expectations created by the context, but were not included in the earlier
table as “shifts.” The complete pattern of changes between codeable and
non-codeable categories is presented in Table 14 below.


TABLE 14

THE INFLUENCE OF CONTEXT AS RELATED TO PREVIOUS CODEABILITY
(in per cent)

                                         AMONG RESPONSES INITIALLY
PER CENT CLASSIFIED IN VARIOUS           REGARDED AS
WAYS IN THE PRESENCE OF CONTEXT          Non-codeable       Codeable

Non-codeable . . . . . . . . . . .
Codeable . . . . . . . . . . . . .

Number of respondents . . . . . .            116


Certain other findings on the interaction of specific variables in
creating effects will be presented below.

Thus far we have presented two experimental analogies to the biasing
effects of attitude-structure expectations on survey results. These
have the advantage of specifying most precisely the nature of such
expectational effects. As indicated earlier, we can even examine
expectations that are constant over the entire staff, since we have a
criterion of the correct response. Also, by virtue of the control of the
design, we can locate the specific aspect of performance through which any
such effects operate. However, a limitation accompanies all such
procedures. The very nature of the experiments involved the creation of
such expectations and some element of artificiality. In the more natural
field-setting, the respondent's answers may not be so well structured, and
a host of uncontrolled situational factors operate. . . . relate to the
narrow realm of . . . influence only the recording . . . We therefore turn
to a . . . survey results. In the field . . . of the effects, since all
components are inextricably involved. As well,
for reasons previously mentioned, it is impossible to measure the effect
of universally held expectations. However, what losses we sustain are
compensated for by a more typical estimate of such processes under
natural field conditions.

This field experiment is described in detail in Chapter VI. It was
conducted in Cleveland and was one of two large-scale field surveys
designed experimentally so as to permit the measurement of the
results obtained from equivalent samples by different interviewers. The
samples were of households rather than individuals, and in 90 per cent
of the instances, the housewife acted as the respondent. On two
questions, certain results for the different interviewers differed so
markedly that one could not attribute the differences to mere sampling
fluctuations. The first question dealt with whether or not the respondent
purchased a series of nine commodities or services, and, if so, whether
the purchase had been made in the neighborhood, and the second question
was a repetition of the inquiry for the main earner or other major
member of the household. Because of the nature of the sample, the first
question almost invariably involved an inquiry into a woman’s behavior
and the second question an inquiry into a man’s behavior. The results
are presented below in Table 15.


TABLE 15

SIGNIFICANCE OF DIFFERENCE OBTAINED BY INTERVIEWERS WITH EQUIVALENT
ASSIGNMENTS ON QUESTIONS RELATING TO PURCHASING BEHAVIOR

                                               AGGREGATED RESULTS FOR
CHARACTERISTIC TESTED                          10 PAIRS OF INTERVIEWERS      P-VALUE
                                               Chi-Squared       DF

“The last time you shopped for ____, did
you get them downtown or in neighborhood
stores?”
  Gasoline . . . . . . . . . . . . . . .         30.75           10           .001
  Auto repairs . . . . . . . . . . . . .         43.21           10           .0001

“Now I'd like to know about the main earner
(main shopper) of the household. The last
time he (she) wanted any of the following
things, did he (she) get them downtown or
in some neighborhood area?”
  Clothing . . . . . . . . . . . . . . .         24.01           10           .01
  Housefurnishings . . . . . . . . . . .         38.04           10           .0001


Since the actual test made on these items essentially involved comparisons
of the attribute “no purchase” plus “don’t remember the purchase”
vs. purchase for the different interviewers, the finding shows
that there is unusually great variation in the frequency with which pairs
of interviewers obtain an answer indicating a woman making the purchase
of an unusual item, such as gasoline or auto repairs, or a man
making a purchase of an unusual item. It is interesting that the item
which is least sex-linked, clothing, shows the smallest difference of the
four (clothing is much more likely to be bought by both members of a
family), and that other items in the list for which there is no prevailing
division of labor between the sexes, buying drugs, patronizing the dentist
or movies, etc., show no significant differences.

The very special pattern of these findings suggests that differential
role expectations among our interviewers as to the buying behavior of
men and women affected the replies they obtained. Out of forty-five
questions tested for interviewer differences, these four plus one other
question were the only ones on which significant findings occurred, and
the three of the five showing the greatest effects were items where the
report of purchase of a given commodity by a man or woman would
represent unusual behavior.

That the effects are not due to the mere content of the questions or
items is clear from the fact that the identical question when asked in the
context of the behavior of the other sex does not yield a significant
difference. For example, housefurnishings, when asked in relation to the
female respondent, yields an aggregated Chi-squared of 11.631 which is
nonsignificant, but when asked about the spouse is highly significant.
The difference between the two Chi-squareds, when tested, is
significant at the .05 level. Similarly, when auto repairs was asked about
the male spouse, the Chi-squared was 12.643 or nonsignificant, and the
difference between the two Chi-squareds as revealed by an analysis of
variance is significant. In other words, the identical question, dealing
with the same commodity, only becomes subject to interviewer effect when
the referent of the question is a person of a particular sex.

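
The significance levels attached to the aggregated Chi-squareds can be
reproduced approximately. The sketch below rests on two assumptions that
the text does not confirm: that each aggregated Chi-squared is referred to
the chi-squared distribution with 10 degrees of freedom, and that the
comparison of two Chi-squareds was made as a variance ratio (F) with 10 and
10 degrees of freedom, for which the tabled 5 per cent point is 2.98. The
function names are hypothetical.

```python
import math

F_CRITICAL_05_10_10 = 2.98  # tabled 5 per cent point of F with (10, 10) df


def chi2_upper_tail(stat, df):
    """Upper-tail probability of chi-squared; closed form for even df only."""
    if df % 2:
        raise ValueError("this closed form holds for even degrees of freedom")
    half = stat / 2
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


def variance_ratio(chi_a, chi_b, df_a=10, df_b=10):
    """Ratio of two chi-squareds, each divided by its degrees of freedom."""
    return (chi_a / df_a) / (chi_b / df_b)


if __name__ == "__main__":
    # Aggregated Chi-squareds quoted in the text and in Table 15.
    for label, stat in [("30.75", 30.75), ("24.01", 24.01), ("11.631", 11.631)]:
        print(label, round(chi2_upper_tail(stat, 10), 4))
    # Housefurnishings: asked about the spouse vs. about the female respondent.
    print(round(variance_ratio(38.04, 11.631), 2))
    # Auto repairs: asked about the female respondent vs. about the male spouse.
    print(round(variance_ratio(43.21, 12.643), 2))
```

Both ratios exceed the tabled 5 per cent point under these assumptions,
which agrees with the statement that the two differences are significant.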
One might raise the query as to why no differences were observed on
the question of automobile repairs when the referent was a man, or on
housefurnishings when the referent was a woman. Certainly such items
are probably regarded as the exclusive purchasing assignments of the
respective sexes. Such questions are obviously linked to role
expectations. The answer lies in the feature of field experiments to which
we previously referred. There might well have been expectations that these
items were bought exclusively by men or women, which might well
have inflated the frequency of reports of purchase of these items
over the entire sample. But since these expectations would be
characteristic of both interviewers who were compared, they would not
be revealed. For example, it is hard to believe that any interviewer
would think that a woman did not buy housefurnishings, or that a man
who owned a car did not buy gasoline. However, with respect to items
that are unusual purchases for a given sex, it is likely that fairly often
one but not the other of the interviewers would assume that a number
of women purchased gasoline, or that a number of men purchased
housefurnishings.

That interviewer effects operated on these questions in the Cleveland 
study is beyond question. The explanation given in terms of role ex- 
pectations seems plausible, but no real proof has yet been presented. In 
contrast with the laboratory-like experiments presented earlier, we did 
not experimentally create any expectations among our interviewers un- 
der controlled conditions to which we can point. We merely observed 
their behavior in the natural setting and inferred the operation of certain 
expectations from the peculiar contents of the findings on certain ques- 
tions. 

However, if it can be demonstrated by refined analysis that these results
vary in an orderly way among interviewers differing in role-expectational
tendencies, the inference would seem well supported. A
series of such analyses is available, all providing support for the
inference. Certain selected ones are presented below. It should be noted
with respect to these analyses that it was impossible to find enough
instances of contrasting characteristics within the pairs of interviewers
who had equivalent assignments.

Consequently, it was necessary to lump together the results of all
interviewers with a given characteristic regardless of the blocks from
which they had obtained their interviews. Thus, if the observed differences
are interpreted in the light of random variation resulting from
simple random sampling, it is possible that some seemingly significant
differences may merely be due to chance, i.e., due to true differences
between the samples of respondents assigned to the contrasted
interviewers. These errors of interpretation result from the underestimate of
the potential extent of variation between aggregates of clusters of
respondents. Since we are here relating various interviewer characteristics
to the obtained interview results, it is necessary to consider the
variation in results between interviewers with the same characteristic(s).
The assumption of simple random sampling may lead us to attribute certain
fortuitous observed differences to variation in some interviewer variable
when in reality that interviewer variable is not really related to that
type of difference at all. However, in a relatively homogeneous area like
that studied, there is no reason to expect any great spatial serial
correlation of sexual division of labor, so perhaps the assumption of
simple random sampling in making these tests is not completely unfounded.
We have no reason to believe that there is a correlation of any sort
between the interviewers' assignments and their characteristics, and we
can consider the respondents of interviewers with different characteristics
to be reasonably equivalent. We may underestimate the sampling variance
between the contrasted groups but probably not enough to invalidate
comparisons.

Moreover, in the analyses that follow, the data are presented purely
for respondents of common characteristics, thus ruling out
respondent variation as the explanation. For example, all
data are presented purely for female respondents. In addition, the
interviewers who are contrasted are matched in certain respects, thus
strengthening the likelihood that the differences observed are due to the
independent variable specified.

That the differences in results are related to expectations about
sex-roles is first supported by the fact that “unusual” purchases are more
frequently reported by interviewers who themselves come from households
where sex-roles are unusual. This is shown below for women
interviewers who had reported in an interviewer's questionnaire on the
purchasing behavior in their own households.


TABLE 16

THE RELATION OF REPORTS OF PURCHASING-BEHAVIOR THAT VIOLATE THE USUAL
SEX-ROLE TO SEX-ROLES IN INTERVIEWER'S OWN HOUSEHOLD

                                             AMONG FEMALE RESPONDENTS,
                                             PERCENTAGE OF HUSBANDS
                                             REPORTED AS PURCHASING
                                             HOUSEFURNISHINGS              N

For interviewers whose own husbands
  purchase housefurnishings . . . . . .              60                   67
For interviewers whose own husbands
  do not purchase housefurnishings . .               45                  307

                                             AMONG FEMALE RESPONDENTS,
                                             PERCENTAGE REPORTING
                                             GETTING AUTOS REPAIRED        N

For female interviewers who had had
  autos repaired . . . . . . . . . . .               46                  328
For female interviewers who had not
  had autos repaired . . . . . . . . .               38                  117
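
For a rough sense of the size of the differences in Table 16, the
percentages can be compared as two independent proportions. This is an
illustrative sketch only: it assumes simple random sampling, which the
discussion above warns may understate the sampling variance when interviews
come from clusters, and the function names are hypothetical. No such test
is reported in the text for this table.

```python
import math


def two_proportion_z(pct_a, n_a, pct_b, n_b):
    """z statistic for the difference of two percentages, using the pooled
    proportion and assuming simple random sampling."""
    p_a, p_b = pct_a / 100, pct_b / 100
    pooled = (p_a * n_a + p_b * n_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / standard_error


def two_sided_p(z):
    """Two-sided normal probability for a z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Husbands reported as purchasing housefurnishings: 60% of 67 vs. 45% of 307.
    z_house = two_proportion_z(60, 67, 45, 307)
    # Female respondents reporting auto repairs: 46% of 328 vs. 38% of 117.
    z_auto = two_proportion_z(46, 328, 38, 117)
    print(round(z_house, 2), round(two_sided_p(z_house), 3))
    print(round(z_auto, 2), round(two_sided_p(z_auto), 3))
```

Because clustering is ignored, the probabilities printed are best read as
lower bounds on the true sampling uncertainty.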
The expectation about the behavior of the respondents and their
spouses would thus seem in part to be predicated upon the real but
idiosyncratic experiences of the interviewer. However, it has also been
argued in Chapter II, and is supported by a body of theory, that such
categorizing of respondents’ answers in terms of gross group memberships
would be related to general tendencies to be stereotypic. We find
that this is the case. Interviewers were asked if there were certain types
of people they would object to interviewing. A small group stated that
they were unwilling to interview Negroes, and this response was taken
as an index of stereotyping. In Table 17 below it can be seen that these
interviewers are less likely to obtain reports of behavior that violate the
usual sex-role.


TABLE 17

THE RELATION OF REPORTS OF PURCHASING-BEHAVIOR THAT VIOLATE THE USUAL
SEX-ROLE TO INTERVIEWERS’ STEREOTYPICAL TENDENCIES

                                        Among Female Respondents,    Percentage of Female
Among Professional Female               Percentage of Husbands       Respondents Who
Interviewers Who:                       Reported as Purchasing       Reported Obtaining
                                        Housefurnishings             Auto Repairs

Refuse to interview Negroes . . . .     43   N = 69   N = 83
Are willing to interview Negroes . .


The theory was advanced earlier that such expectational processes are 
likely to be invoked in the [...] function as aids in the resolution 
[...] can be supported in the analysis of the Cleveland study. About 
half of [...] questions and indicated that they were among the "least 
[...] difficult to understand" or the "hardest to answer." Among this 
group, the frequency with which unusual purchases were reported was 
less. It is suggested that, in the presence of difficulty, interviewers 
are more likely to record an answer on the basis of expectation. 

[...] that have a highly organized [...] exceedingly well for research 
design purposes but may [...] consequence for interviewer effect. In 
the Cleveland [...] situation seemed to be present. Prior to the 
question [...] purchases the respondent had been asked what mode of 
[...] was used to do the food shopping. If the respondent did [...] 
expectational effects [...] repairs are related to the characteristic 
reported by the respondent on the earlier question. Thus, for example, 
stereotypic interviewers [...] interviewers who obtain few reports of 
auto repairs from female respondents [...] constrained to obtain 
increased reports of auto repairs if the respondent had previously 
indicated that she had or used an auto. It can be noted from the table 
that even when we control the characteristics of the respondent by 
reference to the earlier question the stereotypic interviewers are 
least likely to obtain deviant reports. 


TABLE 18 

THE RELATION OF REPORTS OF PURCHASING-BEHAVIOR THAT VIOLATE THE USUAL 
SEX-ROLE TO SITUATIONAL PRESSURES 

                                            PERCENTAGE AMONG FEMALE INTERVIEWERS WHOSE 
                                            REACTION TO THE QUESTION WAS 

                                            Negative     N     Not Negative     N 
Percentage of female respondents having 
  auto repairs...........................      37       197         50         248 
Percentage of husbands purchasing house- 
  furnishings............................      40       161         53         213 


The [...] power of the theory that the Cleveland findings [...] 
tendencies activated by task difficulty is shown [...] Table 20. Among 
interviewers where the two factors combine, there is a minimal report 
of unusual behavior. 
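
As a rough arithmetic check on how Tables 18 and 20 fit together, the subgroup percentages of Table 20 can be pooled, weighting each by its N, and compared with the percentages of Table 18. The sketch below is illustrative only: the dictionary layout and the function name `pooled` are assumptions introduced here, and the inputs are the rounded figures printed in the tables, so agreement within about one percentage point is all that should be expected.

```python
# (per cent, N) pairs as printed in Table 20, grouped by the interviewer's
# reaction to the question. The layout and names are illustrative assumptions.
TABLE_20 = {
    "housefurnishings": {
        "not_negative": [(70, 47), (48, 166)],
        "negative": [(35, 20), (40, 141)],
    },
    "auto_repairs": {
        "not_negative": [(56, 166), (40, 82)],
        "negative": [(38, 162), (34, 35)],
    },
}

# (per cent, N) pairs as printed in Table 18.
TABLE_18 = {
    "housefurnishings": {"negative": (40, 161), "not_negative": (53, 213)},
    "auto_repairs": {"negative": (37, 197), "not_negative": (50, 248)},
}


def pooled(groups):
    """Return the N-weighted percentage and the combined N of the subgroups."""
    total_n = sum(n for _, n in groups)
    weighted = sum(pct * n for pct, n in groups) / total_n
    return weighted, total_n


for item, reactions in TABLE_20.items():
    for reaction, groups in reactions.items():
        pct, n = pooled(groups)
        printed_pct, printed_n = TABLE_18[item][reaction]
        print(f"{item:16s} {reaction:12s} pooled {pct:5.1f} (N={n}) "
              f"vs. Table 18: {printed_pct} (N={printed_n})")
```

The combined N of each pair of subgroups equals the corresponding N in Table 18, and the lowest subgroup percentages fall among interviewers who reacted negatively to the question.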


Thus far, we have described several experimental studies which 
demonstrate the biasing effects of role or attitude-structure 
expectations on survey results. We earlier alluded to a third type of 
expectational process, "probability expectation," and turn now to some 
empirical data suggestive of such expectational effects. The data to be 
[...] are from a variety of sources and only fragmentary, partly 
because the phenomenon was not explored early enough to be fully [...] 
into experimental phases of the project and partly because this type of 
expectation is clearly of secondary importance and therefore not as 
worthy of research priority. 

It [...] be anticipated that probability expectations will be difficult 
to demonstrate. For interviewers to expect a particular distribution of 
attitudes in a sample requires that the object of the attitudes, the 
issue involved, be exceedingly well known. On ephemeral issues, which 
constitute a considerable part of the [...] there would be little basis 
in experience for interviewers to build up such expectations. Of 
course, on issues that are central in the culture, for example, 
approval of polygamy or private [...] but prominent matters, such as 
Truman's [...], we would expect strong probability expectations, but 
such matters are not encountered too frequently in social research. 


TABLE 19 

THE RELATION OF EXPECTATIONAL EFFECTS TO SITUATIONAL FACTORS OF 
QUESTIONNAIRE ORDER 

AMONG FEMALE RESPONDENTS WHO GENERALLY USED AUTO TO SHOP FOR FOOD: 
PERCENTAGE REPORTING HAVING AUTO REPAIRED 

                                                          Per Cent     N 
Professional interviewers not willing to interview 
  Negroes............................................        62        37 
Professional interviewers willing to interview 
  Negroes............................................       [...]     [...] 
[...] 

[...] CAR AVAILABLE FOR FOOD SHOPPING: 
PERCENTAGE REPORTING HAVING AUTO REPAIRED 

                                                          Per Cent     N 
Professional interviewers not willing to interview 
  Negroes............................................        13        15 
Professional interviewers willing to interview 
  Negroes............................................        50        38 
Nonprofessional interviewers (all willing to interview 
  Negroes)...........................................        65        34 

AMONG FEMALE RESPONDENTS WHO DID NOT HAVE AN AUTO AVAILABLE FOR FOOD 
SHOPPING: PERCENTAGE REPORTING HAVING AUTO REPAIRED 

                                                          Per Cent     N 
Professional interviewers not willing to interview 
  Negroes............................................         6        31 
Professional interviewers willing to interview 
  Negroes............................................        13        75 
Nonprofessional interviewers (all willing to interview 
  Negroes)...........................................        18        57 
[...] we would anticipate that such expectations would be [...] 
operations. They are tentative in relation to more [...] subsequent 
expectations established as a result of interaction with particular 
respondents. While the interviewer might expect [...] of ten 
respondents would vote a certain way, this expectation holds for the 
general run of results over the sample and is not necessarily 
maintained for a particular respondent he confronts. The behavior of a 
particular respondent might conform to the more differentiated 
expectation about a given subgroup or about a person with a given type 
of attitude-structure. Consequently, probability expectations would be 
more fluid and elusive and would often not correlate with particular 
subsets of results obtained by interviewers. 


TABLE 20 

THE COMBINED EFFECTS OF ROLE EXPECTATIONS AND SITUATIONAL DIFFICULTY ON 
REPORTS OF PURCHASING-BEHAVIOR 

AMONG FEMALE RESPONDENTS, PERCENTAGE OF MALES REPORTED AS PURCHASING 
HOUSEFURNISHINGS 

Among Female Interviewers                                 Per Cent     N 
Who did not react negatively, and whose own husbands 
  purchase housefurnishings..........................        70        47 
Who did not react negatively, and whose husbands 
  do not purchase housefurnishings...................        48       166 
Who react negatively, and whose own husbands 
  purchase housefurnishings..........................        35        20 
Who react negatively and whose own husbands 
  do not purchase housefurnishings...................        40       141 

AMONG FEMALE RESPONDENTS, PERCENTAGE REPORTING HAVING HAD AUTO REPAIRS 

Among Female Interviewers                                 Per Cent     N 
Who did not react negatively, and who had auto 
  repaired...........................................        56       166 
Who did not react negatively, and who had not 
  had auto repaired..................................        40        82 
Who reacted negatively to question, and who 
  had auto repaired..................................        38       162 
Who reacted negatively to question, and who 
  had not had auto repaired..........................        34        35 


The extreme of this would 
occur under conditions in public opinion research where an interviewer 
interviews a particular homogeneous cluster, rather than a sample of the 
total universe. In such instances, the interviewer might well regard his 
probability expectations as irrelevant to his entire assignment. 

Where probability expectations are strong, and yet in conflict with 
more differentiated expectations for particular respondents, we could 
conjecture about a model that might operate in the interviewer. Pre- 
sumably he would surrender his probability expectations up to a certain 
point in his assignment because they seem less appropriate and valid than 
his more pointed and specialized expectations. But then in so far as he 
felt that the total body of results should conform in some degree to his 
probability expectations, he might then feel that he has accumulated too 
few results of a certain type. He might then do violence to the subse- 
quent individual respondents and even reject the more individualized 
expectation about any case. Thus, where several interviewers have com- 
mon probability expectations about a well-known matter, one might 
even find if they interviewed the same individuals that they arrive at the 
same set of marginal results, despite the fact that they disagree on many 
individuals, since these can be ordered in any conceivable way so long 
as the final accounting is correct. 

If this argument is cogent, it would seem that the most insidious types 
of interviewer effect might occur just in this realm. Marginal results 
could be highly uniform over interviewers and subject to no unrelia- 
bility, and a false sense of security would prevail. But the real meaning 
of the finding might lie in universal expectational effects plus gross in- 
accuracies at the level of subsets of results or results for any respondent. 
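
The conjectured model can be made concrete with a small simulation. What follows is only a sketch under stated assumptions, not a procedure drawn from the studies reported here: the function names, the error rate of 0.3, the quota of 70 "yes" answers in 100 interviews, and the random seeds are all hypothetical. Each simulated interviewer records individual answers with some error, but once too few or too many answers of one type have accumulated, the remaining respondents are forced to fit the expected total.

```python
import random


def record_with_quota(true_answers, expected_yes, rng, error_rate=0.3):
    """Record 0/1 answers so that the final count of 1s equals expected_yes.

    Hypothetical model: answers are recorded with random error until the
    running total makes the expected marginal unreachable otherwise, at
    which point later respondents are forced to fit the expectation.
    """
    records = []
    total = len(true_answers)
    for position, answer in enumerate(true_answers):
        remaining = total - position          # interviews still to record
        needed = expected_yes - sum(records)  # "yes" answers still required
        if needed <= 0:
            recorded = 0                      # quota met: record "no"
        elif needed >= remaining:
            recorded = 1                      # every later case must be "yes"
        elif rng.random() < error_rate:
            recorded = 1 - answer             # ordinary recording error
        else:
            recorded = answer
        records.append(recorded)
    return records


def cell_agreement(first, second):
    """Share of respondents classified identically by two interviewers."""
    return sum(a == b for a, b in zip(first, second)) / len(first)


population_rng = random.Random(0)
respondents = [1 if population_rng.random() < 0.5 else 0 for _ in range(100)]

first = record_with_quota(respondents, 70, random.Random(1))
second = record_with_quota(respondents, 70, random.Random(2))

print("marginal totals:", sum(first), sum(second))
print("agreement on individuals:", cell_agreement(first, second))
```

By construction both simulated interviewers report the same marginal total, whatever the respondents actually said, while their classifications of individual respondents are free to differ.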

This model seems to conform to a common finding in panel studies 
when sets of interview data collected by different interviewers from the 
same respondents are examined. It is often noted that there is unusual 
agreement in the marginal distributions obtained by the two interview- 
ers, but considerable disagreement in the cells of the table, i.e., in the 
classification given the individual respondents by the two interviewers. 
The interpretation usually given to the finding is that the error origi- 
nates out of some process that is random in character and therefore that 
the net result of the system of compensating errors is an unbiased set of 
marginals. Therefore, the evaluation is commonly made that marginal 
totals are accurate, but that one should be cautious about the accuracy 
of measurement at the level of the individual. This interpretation of 
such findings and the evaluation of them certainly is appropriate gener- 
ally. To invoke the operation of probability expectations and conse- 
quently to evaluate the marginals as biased seems unwarranted in most 
instances. While probability expectations must be widespread, it would 
be rare that different interviewers would share expectations with the 
same content. Moreover, this very phenomenon of common marginal 
findings, despite internal differences in the cells, occurs in repeated 
measurements obtained from self-administered questionnaires. Here the 
phenomenon is obviously a function of sheer unreliability and by defi- 
nition has nothing to do with an interviewer. However, the alternative 
explanation that apparently reliable marginal findings may represent the 
effect of common probability expectations might well be considered in 
the special instance of studies involving questions where there is a well- 
established prevailing view. A set of data suggestive of this phenomenon 
is available from the methodological work done in connection with the 
psychiatric assessment of RAF personnel alluded to in Chapter [...]. 
Through a detailed card index, a record was available on all members 
of air crews who had been referred to a RAF neuropsychiatrist by a 
station medical officer. This record contained the opinions of the psy- 
chiatrist plus certain factual data. Tabulation revealed that 541 of the 
approximate 5000 total cases were found to have been seen by more than 
one of the thirty-seven staff specialists. Analysis of the reports filed on 
the same individuals by two different psychiatrists provided general data 
on the reliability of assessment, and material in the specific form [...] 
our model of probability expectations. In examining these materials, 
the reader should not regard the level of reliability as typical, since the 
fact that two or more diagnostic opinions were solicited suggests that 
these were unusually difficult cases. Moreover, the mere fact that a 
man was referred by the station medical officer for any opinion at all 
suggests that the case was more than an ordinary case. However, the 
fact that this was a clearly defined abnormal population makes [...] 
appropriate for our purposes, since the psychiatrists would be 
more likely to have well-structured and common expectations. [...] 
for the difficulty in diagnosis, one can [...] would increase the 
reliability. The two observ[...] completely independently; the second 
psychiatrist frequently [...] statement of the first psychiatrist's 
general opinion available [...]. However, this information should have 
worked [...] agreement in judgment of the individual cases, rather 
[...] similarity of marginal distributions, our major concern. 

Table 21, reproduced from the original report, shows that the agree- 
ment in the marginal distributions for major [...] is remarkably high, 
despite the fact that the two psychiatrists disagreed on the specific 
diagnosis given to 19 per cent of the individual cases. 
Several other characteristics besides the general diagnosis were ana- 
lyzed and reveal this same phenomenon of great agreement in marginal 
totals despite considerable differences in opinion on the individual cases. 
For example, in assigning the cause of the disorder to flying duties or in 
rating the degree to which the individual had experienced stress as a re- 
sult of flying, the detailed tables presented are of the same order. In such 
a situation, where there is a specialized and clearly defined population, 
abnormals, plus considerable past experience of rates or incidences or 
features in that population, one would expect probability expectations 
to be especially operative. They might well lead the interviewer or 
judge or clinician to confirm again the findings of the past, and, in this 
sense, constitute an example of what Merton has referred to as the “self- 
fulfilling prophecy,” “a false definition of the situation evoking a new 
behavior which makes the originally false conception come true. The 
specious validity of the self-fulfilling prophecy perpetuates a reign of 
error.” 


TABLE 21 

REACTION TYPES: THE NUMBER OF CASES DIAGNOSED SIMILARLY OR DISSIMILARLY BY 
TWO DIFFERENT PSYCHIATRISTS AMONG RAF AIR CREWS 

                                        DIAGNOSIS OF FIRST PSYCHIATRIST 
DIAGNOSIS OF           Anx.  Depr.  Elat.  Hyst.   Fat.   Obs.  Org.-  Org.- Schiz.   Lack 
SECOND PSYCHIATRIST   state                       synd.         acute chron.         conf.  TOTAL 

Anxiety state.......    346     13      0     12      3      1      0      0      1     13    389 
Depression..........     14     34      0      3      0      0      0      0      0      0     51 
Elation.............      0      0      0      0      0      0      0      0      0      0      0 
Hysteria............     17      1      0     32      0      0      0      0      0      1     51 
Fatigue syndrome....      5      0      0      0     10      0      0      0      0      0     15 
Obsessional.........      2      1      0      0      0      4      0      0      0      0      7 
Organic-acute.......      0      0      0      0      0      0      0      0      0      0      0 
Organic-chronic.....      0      0      0      0      0      0      0      0      0      0      0 
Schizophrenia.......      0      0      0      0      0      0      0      0      0      0      0 
Lack of confidence..     13      1      0      2      0      0      0      0      0     12     28 
TOTAL...............    397     50      0     49     13      5      0      0      1     26    541 
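
The contrast between agreement in the marginals and disagreement in the cells can be recomputed directly from the cell counts of Table 21. The following is a minimal sketch: the names are illustrative, the columns are assumed to follow the same category order as the rows, and the "marginal gap" (half the sum of absolute differences between the two sets of totals) is a summary measure chosen here for illustration, not one used in the original report.

```python
# Cell counts transcribed from Table 21. Rows give the second psychiatrist's
# diagnosis, columns the first psychiatrist's, assumed to be in the same order.
CATEGORIES = [
    "Anxiety state", "Depression", "Elation", "Hysteria", "Fatigue syndrome",
    "Obsessional", "Organic-acute", "Organic-chronic", "Schizophrenia",
    "Lack of confidence",
]
TABLE_21 = [
    [346, 13, 0, 12, 3, 1, 0, 0, 1, 13],
    [14, 34, 0, 3, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [17, 1, 0, 32, 0, 0, 0, 0, 0, 1],
    [5, 0, 0, 0, 10, 0, 0, 0, 0, 0],
    [2, 1, 0, 0, 0, 4, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [13, 1, 0, 2, 0, 0, 0, 0, 0, 12],
]

total_cases = sum(sum(row) for row in TABLE_21)
second_marginal = [sum(row) for row in TABLE_21]       # row totals
first_marginal = [sum(col) for col in zip(*TABLE_21)]  # column totals

# Cases on the diagonal received the same diagnosis from both psychiatrists.
same_diagnosis = sum(TABLE_21[i][i] for i in range(len(CATEGORIES)))
disagreement_pct = 100 * (total_cases - same_diagnosis) / total_cases

# Share of cases that would have to change category for the two marginal
# distributions to coincide (an illustrative measure, see the lead-in).
marginal_gap_pct = 100 * sum(
    abs(a - b) for a, b in zip(first_marginal, second_marginal)
) / (2 * total_cases)

print(f"cases: {total_cases}, same diagnosis: {same_diagnosis}")
print(f"individual disagreement: {disagreement_pct:.1f} per cent")
print(f"gap between marginal distributions: {marginal_gap_pct:.1f} per cent")
```

On these counts the two psychiatrists differ on 103 of the 541 individuals, which is the 19 per cent cited in the text, whereas the two marginal distributions differ by less than 2 per cent of the cases.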


The earliest methodological research into the biasing effects of proba- 
bility expectations in social research was an experiment conducted by 
Stanton and Baker. While the concept was never explicitly used, it is 
clear, upon reflection, that this was an inquiry purely into probability 
expectational processes, experimentally created in a laboratory-setting. 
Five professional interviewers with at least one year of field-work ex- 
perience were hired and instructed that they would query a group of 
two hundred students presumably to test their memory. The students 
had previously been shown a series of geometrical symbols, and the in- 
terviewers were required to present each such symbol again in conjunc- 
tion with a new one and to determine the respondent's ability to recog- 
nize the correct one. Probability expectations were covertly created by 
giving each interviewer a “key” attached to his questionnaire, which 
presumably indicated which symbol had actually been shown the re- 
spondents originally. The materials were so arranged that the inter- 
viewer was compelled to look at the key each time in order to note 
the response. In point of fact, the keys combined both true and false 
information, but it was verified experimentally that the interviewers 
believed in the accuracy of the key. 
It is clear that this procedure was likely to create in the interviewer 
some expectation as to the frequency of “yes” and “no” answers that 
would be encountered for each symbol in the series. The effect of this 
expectation in biasing the results was determined by comparing the per 
cent of actually correct answers obtained in the sample when the inter- 
viewers believed that the symbol had been previously seen vs. the per 
cent obtained when the interviewers believed the figure had not been 
previously seen. The results were significantly different depending on 
the expectation created. 

The analogy of the task in this experiment to [...] exposure to various 
kinds of media in market research surveys [...] suggests that 
probability expectations might well [...]. One specific example of this 
very fact is [...] in [...] it is shown that interviewers, [...] 
magazine exposure, obtained [...] of the fake items increased. 

A [...] specific [...] conducted by Wyatt and Campbell [...] the 
biasing effects of probability expectations [...] opinion surveys. A 
survey on sentiments about the 1948 presidential election was conducted 
in Columbus, Ohio, in May, 1948, by 223 student interviewers from the 
University. Each interviewer was assigned a geographical cluster, 
in which he was to obtain interviews with twelve [...] selected 
on a [...] basis. The results obtained [...] to a number of potential 
biasing factors, among [...] probability expectations of the 
interviewers. These were determined by having each student estimate, 
in advance of his work, the percentage distribution of answers to five 
of the questions. These concerned degree of 
interest in the campaign, whether the respondent talked about the cam- 
paign with others, the media affecting his thinking on the campaign, 
whether the respondent had a favorite candidate (but not which one), 
and his general party preference. While it appears as if the general area 
of sentiments studied, political sentiment in the 1948 election, would 
lend itself to the growth of expectations, the specific questions examined 
do not seem to be ones where knowledge would be precise enough to 
lead to strong expectations, with the possible exception of the party 
preferred. 

(For this latter issue, expectations were fairly pervasive as indicated 
by the result cited in Chapter II.) Moreover, the clustering of assign- 
ments would suggest, as previously indicated, that the probability ex- 
pectation for the entire population of Columbus might not be a potent 
source of bias, since the more differentiated expectation relevant to the 
subgroup, e.g., “people in a poor neighborhood,” “people in the Negro 
area of town," would be likely to take precedence in guiding the inter- 
viewer. 

For these reasons, the study provides only a weak test of the effects 
of probability expectations. However, in possible opposition to these 
considerations, a factor that might enhance the operation of bias in the 
results is the generally poor quality of the field staff and their lack of 
motivation. Most of the students had no previous experience and worked 
without pay on the survey as part of a course requirement. That the 
quality of their performance was not too high is suggested by the fact 
that only the 1,155 returns from 100 of the 223 interviewers were used 
for the methodological study. The majority of interviewers were ex- 
cluded either because they did not complete their full assignment or had 
falsified interviews. However, it is conceivable that the screening out of 
the worse group does leave in the analysis only a superior, relatively 
conscientious, and relatively unbiased group of interviewers. 

The results for interviewers varying in their expectations were com- 
pared and tested for significance. The summary results for the five 
questions are presented in Table 22 below. In the column labeled “di- 
rection,” a plus sign indicates that the results were biased in the direc- 
tion of the respective expectations of the contrasted group of interview- 
ers. 


[...], from inspection of the results, it appears to us [...] 
understate the significance of the effects. Taken [...] suggestive in 
that four of the five questions [...] results in the direction of the 
interviewer’s expectations [...] levels below .20. In addition, the 
tests understate [...] they were two-tail tests, indicating the 
probability of [...] that magnitude in either direction. The likelihood 
[...] of that magnitude, but in one specific [...] sampling is 
obviously much less and seems [...] for evaluating the hypothesis that 
interviewers obtain 
satisfaction with the neighborhood in which the respondent lived, and 
differences were found in the frequency with which “kind of neigh- 
bors” was given as the primary reason. 

Prior to the survey, interviewers had reported their own rating of the 
importance of “neighbors” in deciding upon the neighborhood. This 
rating can be taken as a crude indicator of probability expectations. 
While the interviewers were not asked to specify the exact distribution 
of answers in the various reason categories, it seems reasonable that those 
interviewers who rated this reason as “very important” are expressing 
the belief that this is likely to be the focus for the attitude about the 
neighborhood. The results for interviewers contrasted with respect to 
the belief that neighbors are important differ in the direction of the 
hypothesis, although they do not reach the usual level of significance. 

A limited test of the hypothesis that probability expectations are ten- 
tative and would be surrendered in the face of more differentiated ex- 
pectations was available from the study, described earlier, on bias in 
coding due to attitude-structure expectations, experimentally created by 
imbedding items within false contexts. The interviewers who coded the 
responses had previously estimated which answer category would be the 
majority position in the population. To test whether differing proba- 
bility expectations are effective when in conflict with an attitude-struc- 
ture expectation, we examined for a number of items the amount of shift 
in coding due to context for interviewers contrasted in their expectation 
as to the majority answer to the question. In other words, for one group 
of interviewers, the attitude-structure expectation was consonant with 
their probability expectation, and for the other group the two expec- 
tations were opposed. The differences were nonsignificant, suggesting 
that probability expectations are only weak and tentative in relation to 
expectations predicated on more specific cues in the particular inter- 
view. This result, of course, must be qualified in the light of the fact 
that the contexts were perhaps more extreme and well structured than 
might be the case in some normal interview situations. 

A considerable body of evidence has been presented that expectations 
of various types do exert a biasing influence on survey results. This 
confirms the theory developed in Chapter II on the basis of qualitative 
material that cognitive factors, hitherto neglected, are of great impor- 
tance in understanding interviewer effects. However, in Chapter II, 
such a theory was also contrasted with the more traditional view that 
bias arises in public opinion research through the communication to the 
respondent of the interviewer's own ideology, or through the inter- 
viewer's motivation to influence the results in conformity with his own 
ideology. It might be argued that some of the evidence presented im- 
plicitly supports the traditional theory about ideological determinants 
of bias, in so far as expectation and ideology are not independent. It is 
well known that perception is determined in part by such functional 
factors as needs and attitudes, and one might therefore construe these 
expectational effects as simply the vehicle or carrier of the interviewer's 
ideology. This view, of course, has little applicability to expectational 
effects in “factual” surveys. One would be hard put to think of an in- 
terviewer’s own opinion or ideology being activated on questions hav- 
ing to do with the possession of certain equipment or the employment 
status of the respondent or the store in which a purchase was made, 
except in the very remote instance where such factual data may have 
some evidential value in the resolution of controversy. With respect to 
such matters, it is perfectly plausible that an interviewer may entertain 
expectations about the answers, but it is unlikely that he is motivated by 
his opinions to affect the results in some particular direction. This con- 
sideration points to a fact not previously emphasized, that expectational 
processes have more general applicability or subsumptive power in ex- 
plaining interviewer effects in social research than ideological factors. 

If the ideology were really primary, it would make considerable dif- 
ference in the inferences we would draw from such experimental re- 
search and might change our whole approach to the control of these 
effects. We will shortly present a body of evidence from experimental 
tests of the effect of the interviewer's own ideology on [...] 
Travers, the co-efficients ranged from .02 to .98 with a median value of 
.42. Additional evidence directly relevant to the correlation between 
probability expectations and interviewer's ideology is available from a 
study by Clark. Students in a course in public opinion estimated the per- 
centage distribution that would be obtained in answer to a series of 
questions. In a preliminary study, they were also asked to record their 
own opinions. The relationship between personal opinion and probabil- 
ity expectation was only moderate. The Wyatt and Campbell study 
also computed the relationship between interviewer’s own opinion and 
probability expectation for each of the five experimental questions. The 
value ranged from .13 to .27. Thus, the relation between interviewer 
ideology and expectations, as inferred from these empirical studies, 
would seem moderate at best. This is not to deny that, in general, cog- 
nitive processes are affected by motivational factors. We have too much 
experimental evidence in support of the general finding. Also certain 
projective tests, particularly error-choice tests in which an individual's 
attitudes affect his guesses on questions of “knowledge,” imply a relation 
between expectation and attitude. However, the evidence cited first 
seems more specific to the interviewer population, the survey situation, 
and the type of expectations generated within an interview. 


3. EXPERIMENTATION ON IDEOLOGICAL PROCESSES 


We have thus far demonstrated the significance of certain beliefs 
within the interviewer that create expectations, which in turn bias sur- 
vey data. Since these beliefs are virtually independent of the interview- 
er's own ideology, such biasing effects can therefore not derive indi- 
rectly from ideological processes. However, as noted in the foregoing, 
the classical view of interviewer effect in public opinion research is 
that the interviewer's own opinions are a major biasing factor—oper- 
ating upon the data either through the communication of the opinion 
to the respondent who then alters his response, or through the inter- 
viewer's distorting of the questioning or recording, so as to obtain 
results in conformity with his own opinions. The phenomenological 
materials presented in Chapter II already cast doubt on the plausibility 
of this theory. Respondents appear to be insulated from such commu- 
nications for reasons of apathy, egocentrism, and the like. Interviewers 
seem to be task-oriented rather than straining for particular answers. 
Nevertheless, the prevalence of this theory, plus past research purport- 
ing to prove the significance of interviewer ideology, required that we 
investigate the problem directly. Therefore, a whole series of quantita- 
tive tests were conducted; all of these essentially yielded negative find- 
questions and recording tasks are treated in Chapter V under the discus- 
sion of situational determinants. 

A second laboratory experiment of similar design was conducted by 
Fisher, and provides evidence on ideological bias in the recording of 
free-answer questions. Student interviewers asked a limited number of 
questions which were answered by Fisher, playing the part of the re- 
spondent. The interviewer, it should be noted, asked each of the ques- 
tions a series of times and obtained each time a different, but long and 
tortuous, answer which was to be recorded verbatim. The task therefore 
had some of the elements of a repetitive training exercise, rather than 
the variety characteristic of a real interview. The total answer to each 
question was composed of elements, each of which expressed a favorable 
or unfavorable sentiment on a given issue. By scoring the recorded ques- 
tionnaires in terms of the distortions and omissions of given elements, 
Fisher could determine whether the errors were predominantly in one 
direction. By correlating the direction of such distortions with the in- 
terviewer's own opinions, Fisher could test the general hypothesis. 

His general results support the hypothesis that interviewers selectively 
record answers in the direction of their own ideology. However, this 
finding is limited to the recording of very long and complex free-an- 
swers in the context of an unusual interview involving the repetitive 
asking of the same question. This suggests that the hypothesis has va- 
lidity only in rather specialized situations where the interviewer is 
confronted with serious difficulties or where the task is of such a nature 
that motivation detrimental to performance develops. 

This suggested limitation upon the operation of ideological bias was 
confirmed in a field experiment on the influence of ideological factors 
on the classification of equivocal answers. The experiment is discussed 
in detail in Chapter V. In summary, the design involved the analysis of 
the results obtained by interviewers of contrasting opinions operating 
successively in two situations. In the first situation, a question form was 
used which was likely to increase the number of highly equivocal an- 
swers, whereas in the second situation, the question form used reduced 
the difficulty in classifying the answers. The results indicated that ideo- 
logical bias occurs only in the situation where ambiguity of response 
creates difficulty for the interviewer in completing his task. 

Other large-scale field experiments conducted in the course of our 
studies show no evidence of the general operation of ideological bias. In 
the major experiment in Cleveland, where role-expectational effects 
were demonstrated with ten pairs of interviewers, each pair receiving 
equivalent assignments, no differences in results could be demonstrated 
second interviewer had a different ideology from the first, that the di- 
rection of shift in the respondent is unrelated to the type of change in 
interviewer ideology. 


TABLE 23 

SHIFT IN PRESIDENTIAL PREFERENCE IN ELMIRA AS RELATED TO THE IDEOLOGIES OF THE 
INTERVIEWERS USED ON SUCCESSIVE WAVES 
(in per cent) 

                                 AMONG RESPONDENTS IN ELMIRA WHOSE SUCCESSIVE INTERVIEWS 
                                 WERE CONDUCTED BY 

                                                Republicans    Democrats 
PERCENTAGE OF RESPONDENTS WHO    Republicans    First,         First,         Democrats 
                                 Both Waves     Democrats      Republicans    Both Waves 
                                                Second         Second 

No change....................        78             79             77             75 
Shifted toward Republican*...        11             11             11              9 
Shifted toward Democratic*...        11             10             12             16 
                                    100            100            100            100 
                                  N = 149        N = 187        N = 56         N = 69 

* A shift toward Republican was scored for any of the following patterns: from 
Democrat to Republican, from Democrat to “Don’t know,” from “Don’t know” to 
Republican. A shift toward Democrat was scored for any of the following patterns: 
from Republican to Democrat, from Republican to “Don’t know,” from “Don’t know” 
to Democrat. 
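
The scoring rule in the footnote to Table 23 amounts to placing the three answers on a single ordered scale and scoring any movement along it. The sketch below is one way of writing that rule down; the function name, the labels, and the numeric scale values are assumptions made for illustration, and only the six shift patterns themselves come from the footnote.

```python
# Position of each answer on an assumed Democrat-to-Republican scale.
# The numeric values are illustrative; only their order matters.
SCALE = {"Democrat": -1, "Don't know": 0, "Republican": 1}


def score_shift(first_wave, second_wave):
    """Classify the change in preference between two interview waves."""
    movement = SCALE[second_wave] - SCALE[first_wave]
    if movement > 0:
        return "toward Republican"
    if movement < 0:
        return "toward Democrat"
    return "no change"


# The six patterns listed in the footnote, plus one unchanged respondent.
examples = [
    ("Democrat", "Republican"),
    ("Democrat", "Don't know"),
    ("Don't know", "Republican"),
    ("Republican", "Democrat"),
    ("Republican", "Don't know"),
    ("Don't know", "Democrat"),
    ("Republican", "Republican"),
]
for first_wave, second_wave in examples:
    print(f"{first_wave:11s} -> {second_wave:11s}: "
          f"{score_shift(first_wave, second_wave)}")
```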


All this evidence is not to suggest that the interviewer's own ideology 
never influences the results he obtains. It merely demonstrates that the 
hypothesis has little merit for the run of conditions characterizing pub- 
lic opinion research in general. For example, it does have merit under 
specialized conditions, such as those where the situation confronting the 
interviewer creates difficulty. The appropriate direction for future re- 
search into interviewer ideology as a biasing agent is toward greater 
complexity—toward specification of these conditions. The theorizing 
behind such specification can come easily out of the kind of analysis 
made in Chapter II of the nature of the experience involved in an inter- 
view. 

This approach to the study of ideological bias can be illustrated by 
one model, developed in connection with our studies, in which ideo-
logical factors are hypothesized as operating basically under rather pe-
culiar circumstances. We argue no great merit for the variables in this
particular model, but the formal nature of the approach seems to us the 
appropriate one. We start with the view that the interviewer may dis- 
tort the results in the direction of his own opinion only in the situation 
where some difficulty is felt. Yet since our phenomenological data sug-
gest that ideology does not seem to work through the process of com-
municating the opinion to the respondent, it would probably operate
basically through cognitive processes whereby the interviewer appraises
the respondent in some biased way. Presumably, the mechanism of pro-
jection would be at work, and the interviewer would see the respondent
as having an ideology something like his own. Yet, our phenomenologi-
cal data suggest that the interviewer organizes his behavior in a more
objective manner and that his expectations arise in other ways. Projec-
tion would be constrained to some extent by such factors. Thus, for
ideology to work via the mechanism of projection, the projection would 
have to contain some logic, some relevance. We therefore theorized that 
the expectation about the respondent would be a projected one, mirror- 
ing the interviewer's own ideology, only where the respondent was of 
the same sex as the interviewer, and where the content of the issue has 
some sex-linkage. In other words, the vehicle for ideological bias is an
expectation; the precipitating factor is situational difficulty; and the
specialized circumstance is that the projected expectation has some ap-
parent relevance such as being appropriate to the sex of the respondent.

The data presented thus far only give suggestive support to the model.
To strengthen the theory, it would be necessary to demonstrate that 
among these same interviewers, the data for respondents of the other 
sex do not conform to the pattern, and to demonstrate for other inter- 


TABLE 24

IDEOLOGICAL BIAS AS LIMITED BY SITUATIONAL DIFFICULTY
AND PROJECTION TO LIKE-SEXED RESPONDENTS
(Deviation in degree of involvement in presidential politics from
Mean Value for equivalent respondents expressed only for those
respondents who are the same sex as the interviewer)

AMONG INTERVIEWERS ANTICIPATING OBJECTION WHO ARE

FEMALE INTERVIEWERS                 MALE INTERVIEWERS
Interviewer      Interviewer        Interviewer      Interviewer
Attaches         Attaches           Attaches         Attaches
Great Deal       Less               Great Deal       Less
Importance       Importance         Importance       Importance
15 =,47 —.29 —.26 
42 —46 45 05 

pg 297 23 
—.03 73 
1 
25 
29 шы) ү. 37 .01 


viewers who anticipated no difficulty that the data for either sex-group
follow no pattern. The materials are too elaborate to present, but in
general they support the model.


4. THE RELATIVE SIGNIFICANCE OF EXPECTATIONS AND 
IDEOLOGY AS BIASING FACTORS 


The general findings presented thus far on the importance of expec- 
tational processes and the insignificance of ideological processes can be 
shown very neatly in some studies where the two factors have been
studied simultaneously. The contrasting of findings on these respective
factors when the findings are not predicated on the same set of condi-
tions involves a considerable element of arbitrariness. The respective
findings may have been predicated on interviewing staffs differing in
competence, on surveys varying in difficulty of execution, on samples
varying in suggestibility, and the like. By analyzing these two sources 
of bias simultaneously, we control such extraneous factors in the com- 
parison. Incidentally, we can often examine each process controlling the 
other and establish their relative importance as primary factors. At
times, we can also see what the total additive biasing effects of both
factors are.

when the effect of expectations is controlled is negligible. This can be
seen by comparing the results which interviewers with contrasting 
opinions assign to the same respondent. The change in results at most 
is 17 per cent. On the other hand, the independent effect of expecta-
tions when ideology is held constant is great. This can be shown by


TABLE 25

THE RELATIVE INFLUENCE OF OPINION VERSUS EXPECTATION ON CODING OF
RESPONDENT'S ANSWER TO QUESTION 7

                                                              SUBJECTS WHO CODE THE
                                                              ANSWER CORRECTLY INTO
                                                              “RIGHT AMOUNT”
                                                                             Number of
                                                              Percentage     Cases
For the Isolationist Respondent
  Interviewers who feel U.S. is spending too much money           19            31
  Interviewers who feel U.S. is spending the right amount         20            60

For the Interventionist Respondent
  Interviewers who feel U.S. is spending too much money           61            31
  Interviewers who feel U.S. is spending the right amount         78            60


comparing the way interviewers of a given opinion code the replies of 
the two different respondents. In each of the two comparisons the ef- 
fect is to change the results by 40 to 50 percentage points. The relative 
importance of these two factors would, of course, vary from survey to 
survey depending on the intensity of the interviewer's ideology and the 
vividness of the attitude-structure of the respondent. In this instance, at 
least, the expectation effects are much more powerful. 

Another simultaneous test of the effect of ideology and expectation 
was made in the course of the experiment where the effect of attitude- 
structure expectations on coding was studied by imbedding responses 
in artificial contexts. The interviewer's ideology was determined by 
obtaining his own answer to the same question prior to the coding as-
signment. In so far as ideology had an effect, we would expect inter-
viewers, contrasted in opinion, to differ in the way they coded the
identical item when it was imbedded in a given context. By virtue of
the design of the experiment, one of the groups of interviewers had an
opinion which was in conflict with the expectation created by the con-
text, and the other group had an ideology which agreed with the con-
text. The measure of the effect of ideology when it interacted with a
given expectation was to see whether or not the amount of shifting due
to context was significantly reduced when the interviewer's ideology
was opposed to the expectation. The summary results for the
items studied are presented below in Table 26. None of the
individual tests is significant, and the aggregate test is
not significant. Ideology has no effect on the coding of these re-
sponses in the presence of an expectation created by context. Again,


TABLE 26

THE EFFECT OF IDEOLOGY WHEN OPERATING IN OPPOSITION TO
ATTITUDE-STRUCTURE EXPECTATIONS AS MEASURED BY AMOUNT OF
SHIFT IN CODING FOR INTERVIEWERS CONTRASTED IN OPINIONS

                      Chi-Squared Value
                      for Difference in
Experimental Item     Shifting Between       Degrees        P-Value
                      Two Groups of          of Freedom
                      Interviewers*

21                        1.22                   1          .20-.30
06                         .04                   1          .80-.90
01                         .33                   1          .70-.80
Aggregate test            1.59                   3          .66

the result must be qualified in the light of the fact that the context was
[illegible] and powerful and probably created a strong expectation as
[illegible] in which the response was contained. Nevertheless, this test
confirms the general findings of the [illegible] analyses made that
ideological bias is only of secondary significance as compared with
expectational processes.
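
The aggregate line of Table 26 can be checked by a short calculation. The sketch below assumes that the aggregate test is simply the sum of the three one-degree-of-freedom chi-squared values (a standard way of pooling independent tests, though the text does not state the method used), and it reads the second and third values as .04 and .33. The function `chi2_sf` is a hypothetical helper written for this illustration.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df >= 1."""
    half = x / 2.0
    if df % 2:
        k = 1
        tail = math.erfc(math.sqrt(half))   # closed form for 1 degree of freedom
    else:
        k = 2
        tail = math.exp(-half)              # closed form for 2 degrees of freedom
    while k < df:
        # Recurrence: Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
        tail += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1)
        k += 2
    return tail


# (chi-squared value, degrees of freedom) for the three items, as read from Table 26.
item_tests = [(1.22, 1), (0.04, 1), (0.33, 1)]

total_chi2 = sum(value for value, _ in item_tests)
total_df = sum(df for _, df in item_tests)
p_value = chi2_sf(total_chi2, total_df)
```

Under these assumptions the pooled value is 1.59 on 3 degrees of freedom, and the upper-tail probability comes out near .66, in line with the aggregate entry in the table.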


CHAPTER IV 


Respondent Reaction in the Interview Situation 


Thus far, we have concentrated on research into the distorting ef- 
fects on interview data of processes operating within the interviewer. 
We have seen how the interviewer enters the situation with certain at- 
titudes and beliefs, which operate to affect his perception of the re- 
spondent, his judgment of the response, and other relevant aspects of his 
behavior. But this is only one side of a complex interaction. The re- 
spondent as well as the interviewer must entertain beliefs and attitudes 
which serve to affect the response he makes and which are, in part at
least, a product of the personal interview procedure. This chapter is
devoted to a theoretical formulation of the processes underlying such
reactional effects and to illustrative empirical demonstrations. A number
of the studies cited are from the earlier literature but are reconsidered 
in the light of a new conceptual framework. 

Certain respondent reactions are independent of anything the par- 
ticular interviewer might do, and are merely a function of the interper- 
sonal nature of the interview situation. They are the result of the in- 
volvement of the respondent in the interview situation. It is clear that a 
high degree of respondent involvement is a considered goal of survey 
agencies, for, by and large, the greater the involvement of the respond-
ent in the situation, the greater his motivation and interest in the task
at hand. However, what seems to be crucial from the standpoint of bias
is not the degree of involvement, but the nature of that involvement.
The involvement of any respondent in an interview situation may be
broken down into two major components: “task involvement” (i.e.,
the involvement with the questions and answers) and what we will call
“social involvement” (i.e., involvement with the interviewer as a per-
sonality). While rapport may be a function of the degree of total in-
volvement, validity may be conceived as increasing with task involve-
ment rather than with the total involvement. To the extent that a re-
spondent’s reaction derives from social or interpersonal involvement,
we may expect it to result in bias, since, under such conditions, the re-
sponse will be primarily a function of the relation between the re-
spondent and the interviewer, instead of a response to the task.

Under what conditions is the social component of involvement in-
creased? First of all, it is obvious that if we remove the “interviewer”
from the situation, we decrease the possibility of respond-
ent involvement with him as a personality. The case for self-adminis-
tered questionnaires rests in part on this argument. It is frequently held
that there can be no interviewer effect if there is no interviewer.
Examination of this view, however, raises certain questions. If we
think of interviewer effect as occurring in two different ways, one being
actual error introduced by the interviewer in asking questions
or recording the answers, and the other being reactive effect upon the
respondent of the visible presence of the interviewer, we shall be better
able to evaluate this view. True, the self-administered questionnaire, by
definition, excludes the former error; but the belief that the physical
absence of an interviewer excludes a reactive effect upon the respond-
ent is mistaken.

We find that subjects filling out questionnaires take account of
the [illegible] of their replies. Thus, qualitative data support
the notion that there may be present an interviewer effect, even when
there is no interviewer. Moreover, the very absence of an interviewer
may act as a biasing factor. For in some respects the interviewer might
act as a check on tendencies among respondents to distort data in some
way that will serve ego-needs.

While it is clear that self-administered studies often contain some
bias from social involvement, it may be stated as an initial princi-
ple that the social component of involvement will be increased as the
interviewer looms larger in the psychological field of the respondent.
Obviously, we may expect that the respondent will be more sensitized
to the “interviewer” when the latter is physically present.

Assuming that in most cases the social component of involvement
will be larger in the presence of the interviewer, let us compare data
from studies conducted by personal interview with those conducted by
self-administration. Whatever systematic bias may be operating as a
result of the greater interaction in the personal interview should be re-
vealed by such comparisons.

1. SYSTEMATIC EFFECTS OF PERSONAL INTERACTION

A number of studies comparing results of personal interview with
results from self-administration are available. By comparing the mar-
ginals, we can assess the systematic effects of the presence of the inter-
viewer, irrespective of specific effects generated by the characteristics
of a given interviewer-respondent relationship.

In studies, reported by Ellis, of the love relationships of female
college students, answers from personal interviews of sixty-nine stu-
dents were compared with those obtained by questionnaires filled out
by the same students a year later.* The sixty questions were divided 
into three groups of twenty each, according to the degree to which 
“the ego would be involved” in answering the question; the judgment 
as to ego-involvement being made by a group of psychologists. Among 
the twenty most ego-involving questions, significant differences be- 
tween interview and questionnaire results at the 5 per cent level were 
obtained on six of the items; on the two groups of less and least ego- 
involving items, three out of twenty and one out of twenty differences, 
respectively, were significant. For example, on the question “How 
much did you love your mother during childhood?”, the distribution 
of responses was as follows: 


                     Interview    Questionnaire
Very dearly             37             25
A good deal             17             27
Pretty much             14             10
Not too much             1              7
Not at all               0              0
                      N = 69         N = 69


In general, the subjects exhibited less favorable (that is, less accepta- 
ble in our society) response patterns on the questionnaire than in the 
interview (fifty-five of the sixty items). The questionnaire produced 
more extreme admissions of traits which have unfavorable connotations 
in our society, such as jealousy, sadism, masochism, aggressiveness, and 
strong sexuality, and fewer extreme admissions of traits which have 
favorable connotations, such as forgiveness, happiness, sensitivity to
beauty, and kindness. Also, the questionnaire elicited more extreme ad- 
missions of traits connoting intense and “perhaps foolhardy” love. 
These were not confined to a few of the subjects interviewed. Of the 
sixty-nine subjects, fifty-three gave on the whole less favorable ques- 
tionnaire than interview responses, eight about the same, and only eight 
more favorable responses on the most ego-involving items, and the dis- 
tribution on the other items was very similar. 

Ellis concluded that in investigations of love and marital relationships 
among college students, the questionnaire technique may produce more 
self-revelatory data than the interview method. Similar findings were 
obtained in a later test with uncategorized responses. 

Since the interviewer in the Ellis study was a male, the findings con- 
ceivably could be accounted for by the sex difference between inter- 
viewer and respondent. However, Ellis refers to a study by Pointer, 
which yielded similar findings, even when the interviewer was a female. 
Pointer concluded that “the questionnaire is more reliable on the basis
of the larger number of admissions of sex practices among the (ques-
tionnaire) group.”

The design of the Ellis study was such as to render the results open 
to serious question. Since the questionnaires were unfortunately ad- 
ministered a year after the personal interviews, it is impossible to be sure 
that differences are due to the method of inquiry. During the particular
time of life when the students were being questioned, willingness to ex-
press attitudes on the subject of love relations might conceivably be un-
dergoing fairly rapid change in the direction of greater freedom of
expression and greater willingness to admit conventionally unacceptable
traits. Then too, the experience of the individuals during that year
may well have been such as to alter attitudes themselves. For these
reasons, the data collected by Ellis, while suggestive, remain inconclusive.

Evidence tending to confirm Ellis’ general findings is yielded by a
study conducted by the Survey Research Center of the University of
Michigan. Anonymous questionnaires, group administered, covering
the attitudinal area of satisfaction with job and supervisor were ob-
tained from workers in a utility company. Personal interviews with 328
of the respondents were conducted at a later date, using two items
[illegible] similar to the original wordings in the questionnaires.
For reasons of the research design, these interviews were conducted
only with those respondents who had [illegible] on the ques-
tionnaire extremely high or extremely low [illegible].
Since [illegible] might differ in the intensity of their [illegible] out-
spokenness, the generalizability of the results to a [illegible] is
qualified. It should also be noted that the lapse of time between the two
sets of measurements was approximately two months, creating the possi-
bility that any differences might reflect the systematic effect of real
changes in the work situation.

Analysis of the results revealed a general tendency [illegible]
to report less dissatisfaction in the personal interview. More in-
teresting is a refined analysis which showed that the interview
had a differentially greater effect on “blue-collar” workers than on
white-collar workers. These differential effects support the assump-
tion that the anonymity of the self-administered questionnaire permits
greater expression of unsanctioned attitudes, since the blue-collar workers
in general were found to be less satisfied with their work.

Another study affording a comparison between the personal inter-
view and the self-administered mail questionnaire was conducted for
Time magazine by Lazarsfeld and Franzen. A mail questionnaire was
sent to 3,000 Time subscribers and 1,052 were returned. Several weeks
later, 1,387 of the original group of 3,000 were interviewed with the 
same questionnaire. Of those, 505 interviews were conducted with per- 
sons who had also replied by mail. For this group, both a completed in- 
terview and a mail questionnaire were available, enabling the results to 
be compared. The survey items covered a wide range of personal and 
family characteristics. 
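
The response figures reported for the Time study can be turned into rates with simple arithmetic. The following sketch uses only the counts given in the text; the variable names and the helper `pct` are illustrative.

```python
# Counts as reported in the text for the Time subscriber study.
mailed = 3000            # questionnaires sent out
returned_by_mail = 1052  # questionnaires returned
interviewed = 1387       # members of the original group later interviewed
both = 505               # interviewed persons who had also replied by mail


def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100.0 * part / whole, 1)


mail_return_rate = pct(returned_by_mail, mailed)
mail_nonreply_rate = pct(mailed - returned_by_mail, mailed)
interviewed_who_also_mailed = pct(both, interviewed)
```

On these figures roughly 35 per cent of the subscribers replied by mail, and about 36 per cent of those interviewed had also replied by mail, so the comparison rests on a minority of the original group.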

Differences between the interview and mail answers were found to 
be significant at the 5 per cent level for eighteen of the sixty-six items 
covered. These items may be classified into four groups following the 
interpretations placed on the differences by the authors: 

1. A higher degree of education, heavier correspondence, and more 
time spent in magazine reading were reported in the personal inter- 
views. The authors’ interpretation is that “the answers obtained by 
mail are more qualified than the answers given to an interviewer.” In 
the case of magazine-reading time, they say “It is reasonable that the 
interview answer represents an outside guess while the mail answer is
more carefully weighed.” 

2. For total family income, price of refrigerator, price of washing 
machine, values in the upper brackets were reported more often in the 
mail questionnaire. The interpretation made here is that activity in the 
higher extremes is more readily admitted in the mail questionnaire.

3. Questions on “unusual types of activity." These include writing 
to newspapers, holding offices in clubs, etc. All these were more 
frequently given in the mail questionnaire. The authors' interpretation 
is that “in general, the unusual type of activity is more freely divulged 
in the mail response than in the interview." 

4. Number of magazines read. The number was much greater when 
reported by mail than when reported by personal interview. The au- 
thors say, “Probably the reason is that the mail query offers more time 
for consideration.” 

The report concludes that “answers obtained through a mail ques-
tionnaire are appreciably more informative and therefore more satis-
factory than answers obtained by an interviewer. On many questions
that involve a degree of activity, the mail answers are more qualified.
On subjects dealing with buying power, mail questionnaires overcame
a reluctance that is apparent in interview responses to reveal activity in
the upper extremes, . . . and fewer people refused information on in-
come.” Further, “These findings substantiate several claims that are
usually made for mail answers: (a) bias that comes from the respond-
ents’ desire to impress or conceal from the interviewer is eliminated;
(b) answers to personal questions are more frequently given in an
anonymous mail reply; (c) a mail reply is filled out in leisure and thus 
produces a more thoughtful answer." 

These conclusions, however, depend on the interpretation of the 
authors. In every case they explain differences in favor of the mail 
questionnaire, by classifying the contents of the questions in various 
ways, after the fact. When more activity is reported by mail, the au-
thors attribute this to “more time for consideration” or “activity in
higher extremes more readily admitted by mail” or “unusual activity
more freely divulged by mail”; but when more activity is reported from
the interview, they say that the answers by mail are more qualified or
that respondent's desire to impress the interviewer is eliminated. The
alternative interpretation could be made that the presence of the inter-
viewer acts as a check on the veracity of the answers, in that it may
make the respondents give a more conservative answer, that is, one
that will not seem inconsistent with the circumstances known to the in-
terviewer.

Parenthetically, it should be remembered that we are dealing only
with those persons who replied by mail questionnaire. Although the
interpretation that “answers to personal questions are more frequently
given in an anonymous mail reply” may be correct for those who do
reply by mail, there are many more people who do not reply at all by
mail. The minority who do take the trouble to answer by mail could
scarcely be expected to leave many questions unanswered. Thus, while the
study may provide additional evidence on the known fact that peo-
ple will not answer all personal questions in an interview, it does not
support the conclusion that the mail questionnaire can generally be
substituted for interviewing, since the majority do not reply.

While the data collected by Lazarsfeld and Franzen do not suffice of
themselves to prove the conclusions of the authors, [illegible]
from our study of the pressures operating in the interview lend
support to the general notion that respondents are [illegible]
willing to reveal certain kinds of information in a personal interview.

A comparison of mail questionnaire and interview results [illegible]
director of the [illegible] quite different from those
reported by Lazarsfeld and Franzen. In April,
May, June, and July of 1948, the Norwegian Gallup Poll conducted a
test on readers' preference for particular articles in the
Norwegian edition of the Reader's Digest.

Information on a random half of the sample households was ob-
tained by personal interview, on the other half by a card left for the
respondent to mail. The results showed no clear-cut differences in the 
order of preference for the various articles between the two methods. 
Over the four-month period, the Spearman rank correlation coeffi- 
cients ranged from .78 to .84. Further, there were no significant differ- 
ences in preference for “serious” vs. “light” articles.’ 
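
For readers unfamiliar with the statistic, a Spearman rank correlation compares the order in which two methods rank the same set of items. The sketch below is a minimal illustration; the preference percentages are hypothetical (the Norwegian figures are not reproduced in the text), and the function names are assumptions of this example.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the two rank lists."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per cent of readers preferring each of six articles, by method.
interview = [34, 29, 22, 18, 11, 7]
mail_card = [31, 33, 17, 20, 12, 5]
rho = spearman(interview, mail_card)
```

A coefficient near 1 means the two methods order the articles almost identically; with the invented figures above the value is about .89, of the same order as the .78 to .84 reported.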

A recent study by the Census Bureau compares the results obtained 
by interview and "self-enumeration." Under the latter method, a 
schedule is left to be filled out by the respondent and is picked up at a 
later date. 

The study was based on the October, 1948, pretest of census pro- 
cedures. In selected areas the two parallel procedures were used: inter- 
view and self-enumeration. Enumerator assignments were allocated to 
the two procedures by a random process. 

In order to determine the relative accuracy of the two procedures, a
re-interview was made of a substantial proportion of the households,
employing a more detailed inquiry about selected topics. Whenever the 
original entry differed from the answer obtained on the check inter- 
view, the respondent was asked to explain the discrepancy. In this 
"quality check," the interviewers were professional personnel from the 
Washington office, so it may be reasonable to assume that the re-inter- 
view information is somewhat more accurate than the original data. 

In general, the results of the comparison were inconclusive. However, 
the check did indicate a possible superiority of the self-enumeration 
procedure in reducing the tendency to round off responses—i.e., the 
tendency to over-report highest school grade completed as eighth 
grade, twelfth grade, etc., or the tendency to over-report age at the 
convenient rounding-off points of forty, sixty-five, etc. Under the self- 
enumeration procedure, the respondent has a chance to check back or 
to look at records. 

In the case of education, the quality check resulted in changes for 
those reporting eighth grade, twelfth grade, and college completed by 
interview of 19 per cent, 12 per cent, and 32 per cent respectively, while 
the corresponding changes for the self-enumeration procedure were 17 
per cent, 6 per cent, and one per cent respectively. However, these data 
are based on only twenty-two interview cases and eighteen self-enumera- 
tion cases. Similarly, the check changed by one year or more 20 per 
cent of the individuals reported by interview as forty years old and 24 
per cent of those reported as sixty-five years old, while the correspond-
ing percentages for self-enumeration were seventeen and twenty-two.
Again the percentages are based on relatively few cases (between

tory findings indicate that such effects may, in certain situations, be
insignificant. In other situations, while effects are evident, they are by 
no means uniform in direction. 

An experimental comparison of telephone vs. face-to-face interviews 
by Larsen bears on our earlier suggestion that one effect of the personal 
interaction of the normal interview may be to reduce prestige-moti-
vated exaggeration by the respondent. While the telephone interview
differs in important respects from the self-administered questionnaire, it 
approximates it in the sense of keeping the felt presence of the inter- 
viewer and interaction between him and the respondent to a minimum. 
In this sense, the findings have relevance to our analysis. 

Fairly comparable samples of individuals were queried by the two 
methods of interview about their behavior following the dropping of 
civil defense leaflets by aircraft over Salt Lake City. The leaflet was in
the form of a postcard addressed to the authorities, and it encouraged
the respondent to answer certain questions and to return the postcard
by mail. In both samples, the proportion claiming that they had re- 
turned the postcard was identical, but comparison with the actual re- 
turns validated 80 per cent of the face-to-face and only 16 per cent of 
the telephone-mailing claims. Further, among the telephone respondents 
who reported exposure to the leaflet, 50 per cent could not report even 
one of the three things it told them to do, compared with 35 per cent 
of the face-to-face respondents. Similarly, 41 per cent of the telephone 
sample who reported exposure could not identify the officials who had
signed the leaflet, whereas only 32 per cent of the face-to-face respond- 
ents could not identify the signers. Other differences in knowledge were 
in the same direction. The claims made on certain other questions also 
seem less credible for the telephone sample. They report more fre-
quently that they passed on the leaflets, told other people the message,
and inquired about the test drop. All of these differences in the direc-
tion of inflated answers to questions of a prestigious nature were so-to-
speak inhibited in the presence of the interviewer.

It is possible for systematic bias to arise from societal circumstances 
which commonly cause respondents to structure their perception of
any interviewer in conformity with some preconception, without re-
gard to the particular interviewer’s actual characteristics. Such tend-
encies toward a uniform structuring of perceptions, if pervasive, can
affect results in a systematic fashion, i.e., the entire body of data secured
may be distorted in a particular direction. 

In a study of the effect of sponsorship, Crespi pointed out that data 
secured under the sponsorship of a fictitious German Opinion Institute
probably contained a measure of invalidity due simply to the fact that
sizable numbers of respondents feared that the interviewer might be
an informer. That such perceptions are by no means unique or limited
to [illegible] cultural climates is revealed in data secured by NORC dur-
ing the period 1948-52, reported below.

In 1948, because of the Wallace candidacy, NORC sent a question-
naire to its interviewers inquiring about the freedom with which re-
spondents were answering political questions. Although the findings
were in no way alarming, the number of spontaneous mentions of such
[illegible] fears by interviewers during the following year led us to
repeat the questionnaire in 1950 and again in 1952. The [illegible] inter-
viewers' comments on this theme that were [illegible]
phenomenon [illegible] not limited to an isolated interviewer [illegible]
to particular localities or types of respondents, and that, insofar as re-
spondent perception of the interviewer would affect data, such effects
would be diffused throughout the survey. Illustrative comments are
presented below:

From a rural area outside Houston, Texas:

This survey was harder because of everyone being [illegible]
information to anyone asking any [illegible]. [illegible]
just wouldn't talk or answer if they could help it. I believe as long as the
situation is as it is, it will be hard to get true opinions on any [illegible]
[illegible] never had so many refusals [illegible] in Houston.

From San Diego, California:

[illegible] said, “She's trying to find out [illegible]
a Communist” . . . One man [illegible]

[illegible]

From New York City:

A good many people refused to answer because they were afraid I was
representing a Communist agency, and thought they would become involved
in a disagreeable situation.


The statistical comparisons of the 1948, 1950, and 1952 results of the
questionnaire sent to interviewers point up the kind of systematic bias
which can develop [...]


TABLE 27

TRENDS IN INTERVIEWERS' REPORTS OF RESPONDENT FEAR AND SUSPICION*

                                                        PERCENTAGE OF INTERVIEWERS
QUESTION                          CATEGORY               1948       1950       1952
                                                       (N = 93)   (N = 89)   (N = 97)

“Did any of your respondents on   Yes                       41         41         41
this survey seem afraid to        No                        59         59         59
answer any of the questions?”                              100        100        100

(If “Yes”) “About how often did   Less than 1 in 10         36         14         19
this happen?”†                    1 in 10 to 1 in 5         33         53         32
                                  1 in 3 to 1 in 4          20          8         10
                                  More than 1 in 3          11         25         19
                                                           100        100        100

“Did anyone refuse to continue    Yes                       13         31         33
with the interview after he once  No                        87         69         67
started it and heard some of the                           100        100        100
questions?”

(If “Yes”) “About how often did   Less than 1 in 10         67         51         12
this happen?”†                    1 in 10 or more           33         49      [...]
                                                           100        100        100

“Did anyone doubt your state-     Yes                       18         34         23
ment of the sponsorship and pur-  No                        82         66         77
pose of the survey or suspect                              100        100        100
that the survey was being done
for some hidden purpose?”

* The interviewer groups are not identical, since there were some changes in the staff during the period.
† Per cents on these questions are proportion of affirmative group rather than of total group.


[...] much higher [...] of its occurrence among their respondents is
[...] function of a given interviewer's characteristics, but the data
cited in the foregoing indicate that a pervasive suspicion exists
which is independent of the appearance or manner of particular
interviewers.


2. DIFFERENTIAL EFFECTS OF PERSONAL INTERACTION


In addition to such systematic effects deriving from the interpersonal
[...], it must be clear that differential reactional effects are also
a source of [...]. Each interview situation has a unique interpersonal
quality. No two interviewers can establish an identical relationship with
a respondent. Nor are any two respondents likely to react in exactly the
same manner to a given interviewer. Where little interaction is present,
we can assume that the interviewer does not occupy a large or well-
structured [...] of the psychological field of the respondent, and
we would expect to find little evidence of reactional bias. [...] of
social involvement in the situation precludes the presence of [...]
effect.

The cases described in Chapter I [...] between involvement and bias.
In the case of “The Creep,” we find an
interviewer with potentially strong biasing tendencies, but a respondent
with a high degree of involvement focused almost entirely on the task
itself. His social involvement with the interviewer is almost nil.
Consequently, we find little evidence of bias, although the total
involvement may be presumed to be high.

In another case, “The Tough Guy,” we also find little evidence of 
bias, but here there seems to be neither task nor social involvement. In 
conformity with our theory, these two cases graphically bear out the 
hypothesis that reactional effects are a function of social involvement 
rather than total involvement. In “The Creep,” task involvement was 
high and social involvement low, and little reactional bias was present, 
while in “The Tough Guy” we find both types of involvement low
and likewise little evidence of bias. 

In contrast to these cases, in “The Hen Party,” a high degree of re- 
spondent involvement of both types existed. The respondent seemed 
most interested in the questions and also in a close psychological rela- 
tion with the interviewer. In this situation of “high rapport,” however, 
we find evidence of reactional bias. Despite the extent of the task in- 
volvement, the social involvement of the respondent was of such degree 
that reactional bias was clearly evident. 

These cases indicate the wide range of variation that can exist be- 
tween interview situations and the extent to which the nature and de- 
gree of reactional effects are a product of the interpersonal relationship 
between interviewer and respondent. 


3. SYSTEMATIC EFFECTS OF GROUP MEMBERSHIP DISPARITIES BETWEEN 
INTERVIEWERS AND RESPONDENTS 


In addition to the systematic effects noted earlier, there is putative 
evidence that the relatively homogeneous character of most interview- 
ing staffs also induces systematic reactional effects among respondents. 
It should be apparent that, quite apart from transient cultural condi- 
tions which bring about general respondent reactions of fear, there exist 
other conditions which are likely to produce a stable, well-structured
perception of interviewers among many respondents. Since interviewers
are a fairly homogeneous group, it seems logical to assume that they
will be perceived (and reacted to) in accordance with their homogene-
ous characteristics. A study conducted by Sheatsley as part of this
project presents convincing evidence of the special character of the in-
terviewer population. Table 28 below summarizes some of the main
[...]

[...] change their personality that respondents would be un[...]

[...]counts for the validity of results is the feeling of mutual warmth and
sociability, usually characterized by the term “rapport.” Thus it has
been held that a disparity prevents the achievement of high rapport and
in turn results in invalidity, and that a similarity permits high rapport
and in turn yields valid results. This theory needs considerable quali-
fication. While there is evidence of reactional effects where group
membership disparities are great, this should not be construed as re-
sulting necessarily from lack of rapport. Our evidence indicates that
the relationship between rapport and group membership is not of such 
a simple nature. 

In order to test the theory that similarity of group membership neces- 
sarily produces greater rapport, tabulations were made of reciprocal 
ratings of reactions to the interview secured from interviewers and 
respondents in a nationwide study. In this project, which was part of 
a larger study of the interview situation conducted by Marshall Brown 
in conjunction with NORC, respondents were handed "rating sheets"
by interviewers at the conclusion of the interview, in which they were
asked a number of questions about the interview and their reactions to
it. The interview itself dealt with issues of current political policy. At
the option of the respondent, the rating forms could be mailed into 
the NORC office in a self-addressed envelope or returned to the inter- 
viewer, sealed or unsealed. In turn, interviewers recorded on a question- 
naire their ratings of respondent "honesty and frankness" and also the 
degree to which they themselves "enjoyed the interview." The rating 
scale used for enjoyment of the interview was identical on both re- 
spondent and interviewer forms. 

Assuming that rapport was highest where both interviewer and re- 
spondent enjoyed the interview, tabulations were then made of the de-
gree to which this variable was a function of respondent-interviewer
group membership similarity. The results are presented in Table 29 for 
the three group membership characteristics tested. 

If the assumption is warranted that ratings by respondent and inter- 
viewer of the extent to which they enjoyed the interview are a measure 
of rapport in the interview situation, it seems clear from the table below 
that rapport bears no necessary relation to group similarity. While
among respondents of male interviewers there is evident such a relation,
the same cannot be said of respondents of female interviewers. Here
rapport seems to be equally high with both male and female respond-
ents. Likewise, if we examine the respondents of both the socio-eco-
nomically high and low interviewer groups, we find that rapport seems
to be lowest in interviews with low socio-economic groups, regardless
[...] interviews with middle-group respondents, while among older [...]


TABLE 29

THE RELATION OF GROUP MEMBERSHIP SIMILARITY TO INTERVIEWER-RESPONDENT
RAPPORT

                                       PROPORTION OF COMBINATIONS WHERE
                                       ENJOYMENT OF INTERVIEW WAS RATED
RESPONDENT-INTERVIEWER                 By Interviewer: Low    Low    High   High
COMBINATION                   NUMBER   By Respondent:  Low    High   Low    High
                                                           (Percentage)

Sex
 Male interviewers
  Male respondents                98                   [...]  [...]  [...]  31
  Female respondents              91                   43     23     16     18
 Female interviewers
  Male respondents               476                   26     23     17     34
  Female respondents             512                   29     24     14     [...]

Socio-Economic Status
  [...] respondents              378                   24     26     17     [...]
  [...] respondents              179                   36     32     [...]  [...]
  [...]

Age
 Interviewers under 30
  Respondents under 30            55                   42     [...]  [...]  36
  Respondents 30-39               47                   32     22     [...]  24
  Respondents 40 and over         90                   39     [...]  [...]  [...]
 Interviewers 30-39
  Respondents under 30         [...]                   [...]  [...]  [...]  [...]
  Respondents 30-39               33                   27     [...]  14     27
  Respondents 40 and over         78                   28     [...]  [...]  [...]
 Interviewers 40 and over
  Respondents under 30           167                   23     17     22     31
  Respondents 30-39              224                   30     23     7      36
  Respondents 40 and over        431                   24     [...]  [...]  [...]
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One might argue that whether the interviewer enjoyed the interview 
is immaterial, and that the rapport measure should only be based on
respondent ratings of enjoyment. If we approach the problem with this
criterion and examine the sum of the percentages in the second and
fourth columns in the table (which measure respondent enjoyment
alone), we find that group similarity is related to rapport only in the
case of male interviewers. All other combinations fail to reveal any
direct relationship.

It is entirely plausible, however, that at particular levels of inter-
viewer competence, group similarity may produce greater rapport.
Where interviewers are less competent or less experienced, it seems
likely that group membership similarities might substantially assist the
interviewer in maintaining rapport. This explanation is suggested by
the preceding table; by and large, NORC's women interviewers are
more competent and more experienced than the men interviewers, and
the older interviewers are at least more experienced than the younger
ones. For women and older interviewers, as may be noted in the fore-
going, the group membership character of their respondents seems to
make little difference in ratings of enjoyment, either when measured
separately for respondents or when both ratings are compounded.

Granted that rapport is not a simple function of group membership
similarity, as has been previously accepted, the theory can also be quali-
fied with respect to the principle that validity necessarily increases with
an increase in similarity. The interviewer's rating of respondent "frank-
ness and honesty," alluded to earlier, may be used as an inferential
measure of response validity. While, of course, we have no basis for as-
suming that the interviewers' reports have any absolute validity, it
seems reasonable to assume that whatever invalidity they contain is
randomly distributed among respondent subgroups. The tabulation of
these interviewer reports in their relation to group membership simi-
larity is presented in Table 30.

If we compare the results in Table 30 with those in Table 29, we
find a high correspondence. Again, it is only among male interviewers
that group membership similarity is a factor in validity ratings. Also, as
in the previous table, we find the lower socio-economic groups rated
as less honest among both groups of interviewers. Age differences are
small and inconclusive.

While there is no evidence here of any relationship of validity to
group membership similarity, it would seem, from the preceding tables
taken together, that there is a direct relationship between validity and
rapport. However, neither of these variables has any general relation
to the similarity or difference in the group membership character of re-
spondents and interviewers. Even among specific groups, it may well
be a factor other than group membership similarity (e.g., the experience
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view. In this instance, a direct measure of response reliability may be 
used as the criterion of quality. In the study just described, one question 
asked earlier in the interview was repeated in written form at the end 
of the respondent rating sheet, which (it will be recalled) the respond- 
ent filled out after the conclusion of the interview. It was possible to 
isolate the respondents who changed their answer the second time the 
question was asked, and to compare the characteristics of the reliable 
and unreliable groups. Here it was found that reliable respondents,
when asked to select from among the list of phrases the one that best
described the interview, were more likely than unreliable respondents
to report that the interview was “like a friendly discussion.”

It would seem, then, that rapport and group membership similarity 
must be viewed as separate operating factors within an interview situa- 
tion. True, in many situations the two factors coincide, and there is
some evidence that under defined conditions similarity may be one of
the factors that induce rapport. But that there is no organic or necessary
relation between these factors seems established from the data pre-
viously presented. It must be that particular other types of affect oc-
curring in specialized instances of disparity are the explanatory princi-
ple for the observed effects of group membership disparities. In certain
such instances, pressures generated as a result of emotions of fear, dis-
trust, or misunderstanding operate. And because the deviant or minority
individual is likely to have a different opinion in the first place, these
fears will operate to alter his opinion in the direction of conformity.
That this seems more tenable than the notion of rapport as an explana-
tion is also clear from the statistical findings to be presented in the next
section. If the factor of rapport were explanatory, results should show
a diffuse effect over many questions. This is clearly not the case. The
group membership disparities locate their effects only on specific ques-
tions: ones where fear and distrust would operate to control the an-
swer given.

In the next pages, we present evidence of differential effects arising 
from group membership differences between interviewers and re- 
spondents. In many of the studies cited, there is no clear proof that the 
effects noted are not due to processes operating within the interviewers
(such as noted in Chapter III). However, the consistency of effects, as
well as the fact that they occur on questions in which respondent reac-
tions would be hypothesized by logic, lends support to our belief that
the data to follow do, in fact, represent effects arising primarily from
processes within the respondent rather than within the interviewer. It
is possible, however, that both types of processes occurred.

Effects arising from differences in color.—We have clear evidence
that the [...] of the interview situation does not overcome the
reluctance of Negroes to express their opinions freely to whites. In a
survey conducted by NORC in 1942 in Memphis, a sample of 1000 Negroes
were interviewed with approximately 500 cases handled by Negro
interviewers and 500 by whites. The two samples were equivalent; that
is, the assignments were randomized as between white and Negro
interviewers. The survey questions dealt with opinions and attitudes
about the war, but there were also a number of questions of a factual
nature. Table 31, below, shows that white interviewers obtained
significantly different results from the Negro interviewers on most of
the individual questions. On almost all the opinion and attitude
questions, the white interviewers obtained significantly higher
proportions of what might be called by some people “proper” or
“acceptable” answers. Negroes were more reluctant to express to the
white interviewers their resentments over discrimination by employers
or labor unions, in the army, and in public places; to express any sort
of belief in the [...] intentions or even possibility of victory of
Japan or Germany; and to express to white interviewers sympathy for the
CIO (possibly out of fear that the white interviewer might think them
too radical). Even on some of the factual questions such as auto
ownership, reading of Negro newspapers, and CIO membership, apparently
some Negroes reported differently to white interviewers than to Negro
interviewers.

It must be remembered that the survey was carried [...] where fear of
the dominant whites is greatest.


TABLE 31

CLASSIFICATION OF QUESTIONS ASKED OF NEGRO RESPONDENTS BY THE DEGREE OF
SIGNIFICANCE OF DIFFERENCE IN ANSWERS TO NEGRO AND WHITE INTERVIEWERS IN
MEMPHIS, TENNESSEE (1942)

                                                                PERCENTAGE OF NEGROES
                                                                GIVING ANSWER
                                                                INDICATED TO:
QUESTION                              CATEGORY TESTED           Negro         White
                                                                Interviewers  Interviewers
                                                                (N = about    (N = about
                                                                500)          500)

Difference Between Responses to Negro and White Interviewers Significant at .001 Level

Is enough being done in your neighbor-
hood to protect the people in case of
[...]?                                Yes                       21            40
Do you think this country will win
the war?                              Yes                       59            79
If we win, do you think the Negroes
will be treated better, worse, or
the same?                             Better                    34            44
Would Negroes be treated better or
worse if Japan conquered the U.S.A.?  Worse                     25            45
Would Negroes be treated better or
worse if Germany conquered the
U.S.A.?                               Worse                     45            60
Is the army fair to Negroes now?      No                        35            11
Is the navy fair to Negroes now?      No                        23            11
Have Negroes, right now, as good a
chance as whites to get defense
jobs?                                 Yes                       39            52
Who is most to blame for this?
(Asked of those answering "No"
above)                                Government                8             2
Are labor unions fair or unfair to
Negroes?                              Fair                      30            47
Is it important to concentrate on
winning the war or on democracy at
home?                                 Winning the war           39            62
Who would a Negro go to to get his
[...]?                                (White people?)           16            [...]
                                      (Police?)                 2             15
                                      (Law courts?)             3             12
                                      (Nobody?)                 26            13
What Negro newspaper do you [...]?    None                      33            [...]
Who do you think should lead Negro
[...]?                                Negro officers            43            22

Differences Between Responses to Negro and White Interviewers Not Significant at .001 Level
but Significant at .01 Level

Do you think Negroes are better off
now than before the war (in what
way)?                                 Less economic
                                      discrimination            21 and 8 (column order unclear)
Which do Negroes feel worst about
now?                                  (Housing?)                [...]         [...]
                                      (Discrimination in
                                      public places?)           [...]         [...]
Does anyone in your family own an
automobile?                           Yes                       20            13

Differences Between Responses to Negro and White Interviewers Not Significant at .01 Level

About how much longer do you think
the war will last?                    Less than one year        28            [...]
Do you think Negroes are better off
now than before the war?              Better off                [...]         [...]
Which do Negroes feel worst about
now?                                  (Job discrimination?)     [...]         [...]
                                      (Wage?)                   [...]         [...]
Have Negroes right now just as good
a chance as whites to get a job?
(Who is most to blame for this?)      (Managers?)               [...]         [...]
                                      (Labor unions?)           [...]         [...]
Which is fair (to Negroes), CIO or
AF of L?                              CIO*                      36            [...]
[...] do you get most of your news
about the war?                        Talking to people*        13            9
What radio station do you usually
listen to?                            WREC*                     [...]         [...]
What is the highest grade of school
you completed?                        High school or better*    19            14

* Difference significant at .05 level.
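
The significance groupings in Table 31 rest on comparing two independent percentages. As a minimal sketch, and not the authors' own computation (the text does not say which test they used), the following assumes simple random samples of exactly 500 respondents per interviewer group and a pooled standard error. The function name `two_proportion_z` is hypothetical, and the row used is "Do you think this country will win the war?" (59 per cent vs. 79 per cent answering "Yes").

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Standardized difference (z) between two independent proportions.

    Uses the pooled-proportion standard error, which assumes simple
    random sampling; that is an assumption, not a fact from the source.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se


# "Win the war?" row of Table 31: Negro interviewers 59%, white interviewers 79%.
# N is given only as "about 500"; exactly 500 is assumed here.
z = two_proportion_z(0.59, 500, 0.79, 500)

# Two-sided .001 level corresponds to |z| of roughly 3.29 under the normal curve.
significant_at_001 = abs(z) > 3.29
print(f"z = {z:.2f}, significant at .001: {significant_at_001}")
```

Under these assumptions the standardized difference is far beyond the .001 threshold, which is consistent with the row's placement in the first section of the table. A clustered sample would call for a larger standard error than the one used here.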
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Additional evidence on the effect of color is available in the work of 
the War Department Research Group. Stouffer reports the following 
findings from a comparison of responses of Negro troops to Negro vs. 
white interviewers:


TABLE 32

RESPONSES BY NEGRO ENLISTED MEN FROM AGCT CLASS IV IN INTERVIEWS BY NEGROES
AS COMPARED WITH INTERVIEWS BY WHITES

                                                            EXCESS IN PERCENTAGE OF
                                                            [...] AS COMPARED WITH
                                                            WHITE INTERVIEWERS

Indicating racial protest                                   plus 21
Indicating low personal commitment                          plus 14
Indicating lack of enthusiasm for war aims                  plus 8
Indicating pessimism about postwar conditions               plus 21
Indicating unfair treatment in the army                     plus 16
Indicating lack of high regard for officers and “noncoms”   plus 2
Indicating relatively low personal esprit or job
satisfaction in [...]                                       plus 8


Effects arising from differences in ethnic group.—Differences of re-
ligion, creed, or nationality between interviewer and respondent may
also produce distortion of results. We have several studies which give
evidence that non-Jewish people with anti-Semitic prejudices will ex-
press these more readily to gentile than to Jewish interviewers. In a
1943 NORC survey, this question was asked: “Do you think that Jewish
people in the United States have too much influence in the business
world, not enough influence, or about the amount of influence they
should have?”

All interviewers in New York City received equivalent assignments
on this study so that a valid comparison of the answers given Jewish
and gentile interviewers can be made as in the table below.


TABLE 33

COMPARISON OF ANSWERS OF NON-JEWISH RESPONDENTS TO JEWISH AND GENTILE
INTERVIEWERS

                                Too Much    Not Enough   Amount They   Don't
                                Influence   Influence    Should Have   Know      N

Percentage of Gentiles inter-
viewed by Gentiles              50          2            38            10        139
Percentage of Gentiles inter-
viewed by Jews                  22          8            58            12        88


A Chi-square test indicates that differences as large as those shown
would occur by chance less than one per cent of the time.
Although these figures show striking differences in the responses of
Gentiles interviewed by Gentiles rather than by Jews, this finding
is somewhat inconclusive because quota-sampling was used on this sur-
vey and thus the effects might have resulted, in part at least, from inter-
viewer selection of respondents to fill his quotas. If, for example, Jew-
ish interviewers selected within their quotas gentile respondents who are
more friendly to Jews, the effects noted could have taken place.
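
The chi-square comparison reported for Table 33 can be approximated from the published percentages. The sketch below is illustrative only: it rebuilds approximate cell counts by multiplying each percentage by its row N, so the counts are not whole numbers and the statistic is approximate. The helper name `chi_square` is hypothetical, and 11.345 is the standard one-per-cent critical value for three degrees of freedom.

```python
def chi_square(table):
    """Pearson chi-square statistic for a rows-by-columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Table 33 percentages (too much, not enough, right amount, don't know) and N.
rows = [
    ([50, 2, 38, 10], 139),  # Gentiles interviewed by Gentiles
    ([22, 8, 58, 12], 88),   # Gentiles interviewed by Jews
]
# Approximate counts reconstructed from rounded percentages (an assumption).
counts = [[pct * n / 100 for pct in pcts] for pcts, n in rows]

stat = chi_square(counts)
CRITICAL_01_DF3 = 11.345  # chi-square critical value, df = 3, one-per-cent level
print(f"chi-square = {stat:.1f}, exceeds 1% critical value: {stat > CRITICAL_01_DF3}")
```

On these reconstructed counts the statistic exceeds the one-per-cent critical value, in line with the statement in the text. One expected count (the "not enough influence" cell for Jewish interviewers) falls below five, so the approximation is rough there. The test says nothing about the quota-sampling objection raised in the same paragraph.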

The controlled studies of Robinson and Rohde present evidence
of the effect of group membership disparity on respondent reaction and
enable us to [...] the theory advanced earlier concerning the relation of
structuring of the interviewer image to reactional effects. Four inter-
viewer groups [...]
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presented for the two questions which constituted the original experi- 
ment. 

It will be noted, first of all, that the frequency of anti-Semitic re- 
sponses on both questions is greatest where the interviewer does not 
appear to be Jewish. As the Jewish identification increases, we find a
decrease in the frequency of anti-Semitic responses, so that where an in- 
terviewer both “looks Jewish” and uses a Jewish name we get the low- 
est frequency. The order of regression is identical for both questions, 
and the relation between the degree of structuring and respondent re- 
action seems clearly established. 

Effects arising from differences in sex.—Some highly suggestive evi- 
dence that respondents tend in some cases to tailor their opinions in a
manner to conform to the opinions or tastes of the sex of the inter-
viewer is furnished by two sets of data. The first of these comes from
the “story tests” on movies conducted by the Audience Research In-
stitute in 1940. This technique consists of handing cards to test sub-
jects on which is written a summary in about fifty words of a projected
movie story. The subject is asked to indicate whether or not he would
like to see the picture. The analysts, surmising that respondents' feelings
about new movies on which they have very little information (only 
a three- or four-line description is given the respondent) are generally 
so mild that many things might operate to influence their choice, de- 
cided to do a study on whether sex of interviewer alone affects decisions 
to any great extent. They suggest, for example, that when a man has 
movie tastes which are fairly indefinite he is likely to say that he favors 
a movie which he believes might appeal to the members of the inter- 
viewer's own sex group. Table 35 below presents detailed results of 
the analysis. 

The effect of the interviewer's sex can be tested by comparing the 
differences between men and women respondents when interviewed by 
their own sex with the differences between men and women respond- 
ents when interviewed by members of the opposite sex. Take as an ex- 
ample the picture, “They Knew What They Wanted.” For male re- 
spondents interviewed by males, the per cent favorable was ten as 
against eighteen per cent for female respondents interviewed by females 
—a difference of eight per cent. But both males interviewed by women 
and females interviewed by men showed the same percentage favorable 
—14 per cent. In other words, sex differences among the respondents 
were small when interviewed by the opposite sex, large when inter- 
viewed by their own sex. 
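
The comparison described above can be written out as a short calculation. This is only a sketch of the arithmetic for the single picture quoted in the text (“They Knew What They Wanted”); the dictionary layout and variable names are illustrative, not the analysts' own notation.

```python
# Per cent favorable, keyed by (respondent sex, interviewer sex),
# using the four figures quoted in the text for this one picture.
favorable = {
    ("male", "male"): 10,
    ("female", "female"): 18,
    ("male", "female"): 14,
    ("female", "male"): 14,
}

# Gap between women and men respondents when each is interviewed by own sex.
same_sex_gap = favorable[("female", "female")] - favorable[("male", "male")]

# Gap between women and men respondents when each is interviewed by opposite sex.
opposite_sex_gap = favorable[("female", "male")] - favorable[("male", "female")]

print(f"same-sex interviewers: gap of {same_sex_gap} points")
print(f"opposite-sex interviewers: gap of {opposite_sex_gap} points")
```

A gap that shrinks when interviewers are of the opposite sex is the pattern the analysts read as respondents leaning toward the presumed tastes of the interviewer's sex.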

In eleven of the twelve tests, the results for male and female respond- 
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criminals; they should be publicly whipped or worse.” “No decent man 
can respect a woman who has sex relations before marriage.” 

The sample was broken into four groups depending on the sex of the 
respondent and interviewer, and comparisons of the results obtained in 
these four groups were made. 


TABLE 36 
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Chi-squared tests were made to determine the significance of the dif- 
ference between the obtained distributions of results for respondents of
a given sex when the sex of the interviewer was varied. Only one test 
was significant at the one per cent level. This was in the case of women 
respondents on the “sex criminal question.” All three other differences 
were not significant. However, the number of cases is too small to show 
up anything but very large differences, and mere inspection of the table 
reveals consistencies which are suggestive of certain effects. It is note-
worthy that the women respondents in the case of both questions ex-
pressed the harsher or more puritanical (or perhaps merely more con- 
servative) attitude to both male and female interviewers than did the 
male respondents. On the other hand, both women and men respondents 
expressed this attitude more frequently to men than to women inter- 
viewers. 

These results were derived from a random sample of households clus- 
tered by blocks, so that any interviewer effects could not have arisen 
from selection of particular respondents by different interviewers. It is
true that the assignments of interviewers were not matched, but em-
pirical data on the population characteristics of the samples interviewed
by men vs. women show no great differences.
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Gallup interviewers were closer to working-class interviewers in results 
than were inexperienced white-collar interviewers, their findings still 
differ significantly from working-class interviewers. 

Katz suggests that this phenomenon may account for the well-known 
tendency of the polls to under-predict the Democratic vote and sug- 
gests employing more working-class interviewers or better training of 
white-collar interviewers. He also makes the important point that the 
bias, if real, should be large in some cases, negligible in others, depend- 
ing on the subject matter.

Conceivably the difference in results may be due to differences in the
ideology or expectations of the two groups of interviewers, rather than
to the reactions of the respondents. The opinions of the interviewers
themselves were obtained; they revealed that the working-class inter-
viewers were more radical and isolationist than the middle-class in-
terviewers. However, Katz attributes the differences to “better rapport”
obtained by the working-class interviewers, suggesting that they were
more easily able to get at the true attitudes, because the working-class 
respondents, especially those with strong pro-labor views, would talk 
more freely to members of their own class. As evidence of the greater 
validity of responses obtained by working-class interviewers, he cites 
the fact that they report more verbatim comments, and that the results 
they obtain correspond most closely to those secured by experienced in- 
terviewers. 

Effects arising from differences in residence.—Data to compare the 
validity of responses obtained by strangers or nonlocal interviewers 
with those obtained by local interviewers are almost nonexistent. One
apparent advantage in favor of the stranger interviewer lies in his ano-
nymity, reinforcing the impersonality of the interview situation, and
providing reassurance to the respondent that his answers will not be
bruited about the neighborhood. For example, one of the technical criti-
cisms of Kinsey's interviewing method referred to his procedure of
building up patterns of intimacy with the potential respondent prior to
the actual interview. The psychoanalyst is the repository of our most
sacred thoughts partly because he is a “stranger.” The sociologists have
built an elaborate theory supporting this notion. Except in times of
war and spy hysteria, when he might be regarded suspiciously, the
stranger interviewer has the advantage. An example of the latter type of
situation is furnished by a 1943 OWI survey dealing with security of
information.

The survey was made in the cultural setting of a small town during 
the war, and during the worst period of spy scares. Five local interview- 
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the atmosphere in which the study was conducted gave to Negro-white 
relationships their strongly affective character. 

This hypothesis is supported by the comparison of the data secured in 
Memphis, with a replication conducted in New York City. Because of 
the difference in the cultural norms surrounding Negro-white relation- 
ships in New York, one would expect that the reactions of Negro re- 
spondents to the group membership of the interviewer would be less 
strongly manifested in the data. We will not repeat for New York City 
the detailed data given for Memphis, but instead we present in Table 38 
below a comparison of differences between results obtained by white 
and Negro interviewers in the two cities, showing how many questions 
yielded differences at each level of significance. The comparison is 
based on the eighteen opinion questions and the three questions of a 
factual nature which were common to both surveys. 
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When we compare the individual T-value or standardized differences, 
that is, each difference divided by its standard error, we find eighteen 
questions on which the standardized differences are higher in Memphis 
than in New York, and only two for which the differences were lower. 
The probability of obtaining eighteen or more higher differences out of 
twenty-one is about one in 10,000. Thus there is scarcely any doubt 
that Memphis Negroes are more reluctant to talk freely to white inter- 
viewers than are New York Negroes. 
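
The sign-test reasoning in the preceding paragraph can be checked with a short calculation. The sketch below is an illustration only. It assumes that each question is an independent trial with an even chance of showing the larger standardized difference in either city, and the function name `upper_tail` is hypothetical. Under those assumptions the one-sided probability of eighteen or more out of twenty-one falls below one in a thousand. The exact figure depends on how ties and dependence among questions are handled, which the text does not specify.

```python
from math import comb


def upper_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """One-sided binomial probability of `successes` or more in `trials`."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Assumed reading of the text: 18 or more of 21 questions higher in Memphis,
# with each question treated as an independent fair trial (an assumption).
tail = upper_tail(18, 21)
print(f"P(18 or more of 21) = {tail:.6f}")
```

The same function can be called with other counts, for example `upper_tail(18, 20)` if the one remaining question is treated as a tie and dropped.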
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1. NATURE OF SITUATIONAL DETERMINANTS 
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interviewers refrain from taking notes within the interview and record 
the answers at a later time from memory, we might introduce bias into 
every interview, because motivational or autistic factors might affect 
the memory processes of every interviewer. 

The establishment of standardized interview procedures attests the 
importance of situational determinants of interviewer effects, whether 
these effects are regarded as varied among interviewers or common to 
an entire field staff. The precepts given the interviewer in the course of 
training, or as instructions attached to a particular survey, are so de-
signed as to produce a particular uniform role, or pattern of interview-
ing behavior, and so to reduce variability among interviewers as well as
any undesirable behavior originally characteristic of all interviewers. 
But such role prescriptions are not always effective and, even when 
they are, situational determinants of interviewer effect can be of im- 
portance. The qualitative evidence presented in Chapter II demonstrates 
dramatically how pressures generated by the situation force the break- 
down of the prescribed role. 


2. TESTS OF THE OPERATION OF THE TOTAL COMPLEX 
OF SITUATIONAL DETERMINANTS 


Quantitative evidence can be presented to demonstrate that inter- 
viewer effects are mediated by the situation. If the occurrence of effects 
did derive from a particular enduring set of propensities in the inter-
viewer, independent of the specific situational field in which that inter-
viewer is operating, one would expect interviewers to manifest their
effects completely consistently over a variety of circumstances. If, how-
ever, effects over a variety of situations are thoroughly inconsistent, one
must conclude either that only temporary internal processes are in-
volved, or that the persistent biasing tendencies are activated essentially
by situational determinants. In between the limits of consistency vs. in-
consistency of behavior across situations, one obtains an expression of
the relative contributions of enduring personal and situational determi-
nants to actual interviewer effects.

Five such demonstrations of degree of consistency of the same inter- 
viewers observed repeatedly in different situations are available and are 
presented below. These demonstrations have in common the property 
that the specific changes in the situational field from observation to ob-
servation are not susceptible to precise analysis. To isolate the effect of
a specific single situational determinant calls for the type of experimen- 
tal approach to be presented in Section 4 of this chapter. These demon- 
strations, however, have the unique virtue of revealing the effect of the 
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The wide variation in stability of results for the different interview- 
ers suggests that the interaction of situational and personal factors 
which determines interviewer effect is in turn a function of some other 
personal determinant within the interviewer. There may well be some 
more basic psychological process differentiating humans who are sensi- 
tized to changing situational fields from other humans who are less re- 
sponsive to external events. 

While we have no tests or evidence bearing explicitly on this aspect 
of our theory, recent experimental research in perception gives strong 
support to such a typology. Witkin has recently demonstrated that 
there is considerable consistency in the way in which the individual re- 
sponds to a series of perceptual tasks.* These tasks were so constructed 
that they yielded a measure of the degree to which the person used his 
own internal postural experiences rather than aspects of the external 
visual field in the process of perception. Witkin found that some indi- 
viduals are markedly “field dependent,” or oriented to the external 
aspects of the situation, whereas other individuals tend consistently 
to be “independent of the field,” and that there is another group of 
individuals who are persistently unstable with respect to their sensitivity 
to the field. He remarks: “It is quite clear that a tendency to rely mainly 
on the visual framework or to remain independent of the field through 
awareness of bodily experiences represents a fairly general character- 
istic of individual orientation." 


TABLE 39

Other demonstrations are available to show the inconsistency of the
interviewer's biasing potentialities within the same unit interview. Such
demonstrations suggest that situational determinants of a most transient
sort interrupt the biasing processes.

In the course of re-interviewing a panel of respondents in the 1948
political study in Elmira, the interviewer who conducted the first in-
terview with a given respondent in June was generally not assigned to
the re-interview conducted in October. Comparison of the answers ob-
tained by the different interviewers from the same respondents shows
that many respondents changed their attitudes. While most of this
change must reflect processes within the respondent, some portion of it
presumably derives from the particular interviewer who asked the ques-
tions. The variation in the amount of change among the samples of dif-
ferent interviewers is so large that this assumption seems warranted. For
example, on a question on attitudes toward labor unions, the proportion
of respondents changing ranged from a minimum of 20 per cent for one
interviewer to a maximum of 69 per cent for another interviewer. On
another question dealing with expectation of war, the proportion of re-
spondents who changed their opinions varied among the different inter-
viewers from 22 per cent to 78 per cent. On a third question dealing
with the locus of blame for the Jewish-Arab conflict in Palestine, the

the source of such effects were purely within the interviewer him-
self, one would expect the interviewer who had many changers on
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Another demonstration of the partial inconsistency of interviewer 
biasing tendencies is available in the realm of probing behavior while 
asking open-ended questions. The tendency of particular interviewers 
all dealing with equivalent respondents to obtain many or few multiple 
answers to each of four open questions contained within the same ques-
tionnaire was determined. It was found that interviewers differed sig-
nificantly in this tendency. These differences could not be allocated to
intrinsic differences in respondents because of the design of the samples 
and must therefore represent interviewer effects.” 

While the four questions covered different content areas, the formal 
task of probing was the same in each instance. The influence of situa- 
tional determinants on the consistency of the interviewer’s effect can be 
demonstrated by computing the rank-order correlations for the amount 
of multiple answers he obtained on pairs of questions. The median value 
for rho was .48, demonstrating that, while there is considerable consist-
ency of effect in the realm of probing due to intra-personal factors, it
is in part disrupted by situational factors.
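
The computation described above, a rank-order correlation for every pair of open questions followed by the median of those values, can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used in the study. The counts in `multiple_answers` are invented, five interviewers are assumed, and the helper names `ranks` and `spearman_rho` are hypothetical.

```python
from itertools import combinations
from statistics import median


def ranks(values):
    """Average ranks (1-based), sharing the mean rank among ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Rank-order correlation: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: number of multiple answers obtained by each of five
# interviewers on each of four open questions.
multiple_answers = {
    "Q1": [12, 9, 15, 7, 10],
    "Q2": [11, 10, 14, 6, 8],
    "Q3": [9, 12, 13, 8, 7],
    "Q4": [14, 8, 10, 9, 11],
}

# One rho per pair of questions, then the median across the six pairs.
pair_rhos = {
    (a, b): spearman_rho(multiple_answers[a], multiple_answers[b])
    for a, b in combinations(multiple_answers, 2)
}
median_rho = median(pair_rhos.values())
print(f"median rho over {len(pair_rhos)} question pairs: {median_rho:.2f}")
```

A median near 1.00 would indicate that the same interviewers obtain many multiple answers on every question; a median near zero would indicate that the tendency is almost wholly specific to the question.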

Two other demonstrations of the intrusion of situational determinants 
are available. In these studies, wire recordings were made of the actual 
interviews, and by comparing the written interview with the wire re- 
cording, the number and type of errors made by each interviewer can 
easily be judged and scored. In both studies, the comparisons made are 
limited to interviews conducted by a number of interviewers interview-
ing the same respondent, who was prepared in advance with a “set of
attitudes” (and in one study, a set of actual answers to be given).
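
The comparison of the written interview with the recording can be sketched as a simple tally. The example is hypothetical throughout: the answers, the interviewer labels, and the scoring rule (one error for each question on which the written record departs from the recording, with an omitted answer counted as an error) are assumptions made for illustration. The studies themselves scored both the number and the type of errors.

```python
def error_score(recorded, written):
    """Count questions where the written answer departs from the recording."""
    return sum(1 for q, said in recorded.items() if written.get(q) != said)


# Hypothetical answers given by the planted respondent, per the recording.
recording = {"q1": "agree", "q2": "disagree", "q3": "no opinion", "q4": "agree"}

# Hypothetical written questionnaires turned in by two interviewers.
written_records = {
    "interviewer_A": {"q1": "agree", "q2": "disagree", "q3": "agree", "q4": "agree"},
    "interviewer_B": {"q1": "agree", "q2": "agree", "q3": "agree"},  # q4 omitted
}

scores = {name: error_score(recording, w) for name, w in written_records.items()}
print(scores)
```

Because every interviewer faces the same prepared respondent, differences among the scores cannot be laid to differences among respondents.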

Comparison of errors made in different parts of the interview enables 
us to estimate the effect of contrasted transient situational elements. 
While the temporal process carries with it elements of practice and fa- 
tigue, plus an opportunity for the reorganization of perception and sen- 
timent as a result of the on-going interaction, it, of necessity, exposes 
the interviewer to new types of tasks as new subject matters are touched 
on and new forms of inquiry are used. In general, our hypothesis would 
hold that where the parts of the interview compared are situationally 
similar we would find greater consistency in the error scores for a given 
interviewer; where there is situational dissimilarity, such consistency 
should be less in evidence. 

In the early study by Lester Guest, interviewers’ total error scores 
were computed, and the relative accuracy of interviewers as between 
the first and second halves of the interview was compared. The data 
from the Guest study are presented in Table 41.

The influence of situational factors is best summarized by ranking

and second thirds of the interview the rank order correlation was .75,
between the second and third portions, .74, and between the first and 
third, .51. While the correlations are considerably less than unity, it is 
clear that in this experiment there was much greater interviewer con- 
sistency than in the earlier study. Of course, any number of factors 
might have played a part in the differences obtained, but it seems likely 
that two specific considerations are involved: 

1. The uniformity of the role played by the planted respondent in 
the AJC study. While Guest’s respondent gave rather “typical” re- 
sponses and played the role of a normal respondent, the AJC respondent 
used in the present comparisons persistently adopted the role of a tough, 
recalcitrant lower-class individual in the interview. In the face of such 
uniform personal behavior on the part of the respondent, it seems quite 
logical that transient situational elements would play a smaller role in 
affecting interviewer error. 

2. The greater similarity of question types. In the AJC study, twenty 
of the thirty-three attitude questions were of the agree-disagree type 
and these were located in all three parts of the interview. Therefore, if, 
instead of dividing up the interview into temporal units, we select other 
criteria for division, the importance of given situational factors, as well 
as the relation between interviewer consistency and situational similar- 
ity, can be convincingly demonstrated. Likewise, the differences be- 
tween the findings of Guest and the AJC may be more clearly under- 
stood. 

The questionnaire with which the AJC experiment was performed 
consisted mainly of questions in four areas: (1) attitudes toward Ne-
groes and toward discrimination against Negroes, (2) attitudes toward
Jews and toward discrimination against Jews, (3) so-called “authori-
tarian” attitudes selected from the Berkeley scale, and (4) factual
questions about the respondent. With the exception of the factual data,
the questions on the various areas were equally distributed throughout
the questionnaire. The factual data were less well scattered, a few ques-
tions being asked in the beginning and the majority at the end. The
Negro and Jewish questions were the same in content and formal struc-
ture. The authoritarian questions were similar to the Negro and Jewish
questions, with one important difference: they contained no “card-
type” questions. The factual questions were necessarily quite different
in form from any of the others.

In Table 42, we present the rank-order correlations among the nine 
interviewers for errors made on the various content areas.


It may be seen from the following table that, while correlations are
positive, they are far from 1.00, indicating that there is considerable
inconsistency in the tendency of interviewers to make errors on ques- 
tions even within closely related areas. More revealing, however, is the 
variation in the correlation co-efficients across different areas. Where 
the content is similar and the form identical (the Negro and Jewish 
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within our power to manipulate them and thus to reduce bias. The 
biasing tendencies among interviewers and respondents would still exist 
but would operate minimally because of the nature of the interview 
situation we provide. 

The history of industrial psychology will demonstrate by analogy 
such an approach to the treatment of error. In the adjustment of the 
worker to the machine, psychologists initially developed selection and 
classification tests to find those individuals who would perform most 
effectively within the industrial situation. The machine was taken as a 
"given," and the “errors” were located within the individual and con- 
trolled by a system of selection. However, the more recent development 
of “psycho-engineering” reversed the procedure." The limitations of 
the human were regarded as a given, and the problem was seen as that
of re-designing the machine in such fashion that human capabilities were 
not overly strained. 

Of course, this analogy should not be strained. Designing the survey 
in terms of the limitations of current interviewing staffs would lead to 
gains in the control of error; but in the long run, such a policy would 
freeze current research practice at a relatively low level. What would 
seem to be indicated is an approach to the problem of interviewer effect, 
both directly through interviewer selection and training and indirectly 
through control of situational factors eliciting or facilitating the biasing 
tendencies of interviewers and respondents. 


3. PAST LITERATURE ON SITUATIONAL FACTORS AS A 
GUIDE TO REFINEMENT IN THEORY AND RESEARCH 


While the analyses just presented establish beyond doubt the influence 
of situational determinants upon the operation of interviewer effects, 
they contribute little to an understanding of the nature of such influ- 
ences. The complex of situational factors must be analyzed; experimen- 
tal studies of specific factors must then be conducted, and a theory must 
be developed which will aid us in constructing situations which are not 
likely to engender the biasing processes within the interviewer. Rather 
than embark on an endless project in which every single segment of the 
total situation is subjected to experimental study, or attempt to construct 
a theory of situational determinants out of thin air, we shall first turn 
to some past research into situational factors to help clarify the problem 
and give leads to our own research.

While we can find little evidence in past literature as to the way in
which interviewer effect is mediated by the situation, there is a consid-
erable body of literature from self-administered questionnaire studies
showing the effect of situational factors on the results obtained. For
example, the situational factors of question form and anonymity have
been subject to massive past research.” While these studies, by defini-
tion, provide no evidence on the interviewer's behavior, they are most
relevant to our problem. They have the virtue of suggesting that, in
part, the situational factor present in a personal interview survey may
have an indirect effect on the interviewer. Since the self-administered
studies show that respondents’ replies can be changed by altering a situ-
ational factor, they suggest that, when an interviewer operates within a
particular situation, regardless of what he himself may do, he may meet
one kind of reply rather than another. In turn, his effect on the data
may occur during the processes of coding, judging, recording, or
probing the response rather than in the initial asking of the question.
Consequently, in our specific theory of situational determinants, we are
led again to stress alterations in interviewing tasks at the later stages of
[illegible] rather than the influence of given situations on the opportunities
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effects, and clarification of processes normally lumped under a given 
situational heading is needed before one can undertake meaningful 
research on such a factor. To illustrate the complexity of situational 
variables and as a guide to such clarification we shall consider the prob- 
lem and the literature in two traditional research areas—the effect of 
situations varying in respondent anonymity and the effect of situations 
where sponsorship is altered. 

Anonymity.—Mere consideration of the situational factors that relate 
to respondent anonymity in the usual personal interview survey reveals 
that the literal fact of anonymity provides no necessary psychological
anonymity.

Although names are usually not taken, virtually all surveys require 
addresses of the respondents. But even where no addresses are taken, 
there still exists no psychological anonymity. It is obvious to the re- 
spondent that he can easily be identified, and it is safe to say that he 
seldom really feels anonymous in the situation. The interviewer and the 
respondent have developed a relationship which, although transient, has 
identified the respondent in some respects to the interviewer. He is 
present to the interviewer as a person, and, as we have discussed in the 
previous chapter, interactional effects may result from the mere exist- 
ence of a personal relationship. 

Complete anonymity is probably most closely approximated in group 
administered questionnaire studies, involving unsigned questionnaires." 
The empirical evidence from self-administered questionnaires under- 
scores the complexity of the problem of situational factors. While the 
weight of evidence establishes a particular type of change when re- 
spondents are identified, qualifications become evident when the results 
of studies are compared. In studies in the field of personality or clinical 
psychology, different results are generally obtained when questionnaires 
or rating scales require the respondent’s signature. The experiments in
this area by Maller, Olson, and Fischer show consistent differences of
varying significance.

In Maller's study of co-operativeness in children, he found large dif- 
ferences in the ratings given to themselves and others by children asked 
to rate the group members for “co-operativeness.”** When the question-
naires were signed, Maller found an increase in the number of other
children rated as co-operative, suggesting that the reactions when ques- 
tionnaires were unsigned represented more "genuine" reactions. 

Olson also found differences in responses to unsigned as opposed to 
signed questionnaires, using the Woodworth-Mathews Personal Data 
Sheet. Subjects were more likely to admit statements of “feelings” as- 
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tion of scale scores on a given subject but a far less striking difference 
occurred between the two groups in terms of distribution of responses 
to one or more other items in the scale.”

Maller refers in his previously cited study to a similar variation in 
the effect of anonymity on different subject matters. The particular 
variation, however, points to a fundamental clarification of processes 
that work in two opposing directions under conditions of anonymity.
Although, under conditions of anonymity, children were more in- 
clined to rate others more critically, they also rated themselves more 
favorably (as more co-operative). Thus, while anonymity seems to free 
the respondent from fear of reprisal for criticizing others, it also seems 
to free him of inhibitions about inflating his prestige. This latter effect 
of anonymity seems normally neglected in past discussions. For exam- 
ple, Kinsey went to such great lengths to preserve confidence out of 
concern that respondents, unless assured of anonymity, would not re-
port unsanctioned sexual activities which would subject them to reprisal
or court action or deflation of prestige. But he slighted the possibility 
that they might feel freer to boast about or to exaggerate sanctioned 
forms of sexual activity under conditions of anonymity. Hyman and 
Sheatsley, in commenting on this study, cite such frequent illustrative
quotations from Kinsey as "Cover-up is more easily accomplished than 
exaggeration in giving a history.” They demonstrate, however, by in- 
ternal examination of Kinsey’s data, that the errors actually were in both 
directions.” 

There is some evidence that anonymity is more of a problem under 
particular cultural conditions or in a given climate of opinion. Where 
there is any fear on the part of respondents of possible punishment for 
expressing certain opinions, anonymity would seem to be crucial. It is
difficult to see how anonymity can be assured to such respondents in-
terviewed in their own residence, since they are obviously identifiable
to the interviewer, and could be located with ease. That such fears are 
operative within the population has been documented in the previous 
chapter. Anecdotal material from Japan further supports the notion that 
the situational factor of anonymity must be seen in the context of the 
culture. There was some indication in that society that surveys where 
names were not taken might be answered in a more frivolous fashion 
because of the Japanese experience that any serious inquiry in the past
involved the recording of names.” In addition to the larger climate, local
environmental differences and subcultural factors may be presumed to 
affect the importance of anonymity. Thus, among line troops in a dis-
ciplined unit, it was usually necessary to stress the factor of anonymity;

In this case, the government is an affect-laden object; it is set up
by a former enemy, so that we would expect that the effects of sponsor-
ship demonstrated are probably maximal estimates of the effect of this
variable. Admittedly the very nature of the government in Germany is
quite different from that of the usual case, and the results should hardly
be taken as typical findings.

an inclination toward militarism?” the differences found (significant at
the .01 level) are in the opposite direction to what would be expected if
the usual motivation were the sole one operating. Crespi explains this
phenomenon as follows:


The apparent MG sponsorship effects . . . are all instances of
respondents telling the American authorities what they like to hear. Ques-
tion six (above) now suggests that this [illegible] all, when such a
[illegible]

Whatever errors may enter into responses under government
sponsorship in Germany, Crespi feels that this may be more
than compensated for by the reduction of other errors, which take place
in German-sponsored surveys. There seemed to be less “no opinion”
and more interest among respondents when surveys were spon-
sored by the military government. Under German sponsorship, respond-
ents felt some insecurity from an uncertain definition of the situation.
The greater security and interest when the sponsorship was govern-
mental is interpreted by Crespi as common-sense realization on the part of
the interviewees that only the military government was really
in a position to remedy some of the difficulties they faced.

The ability of the sponsoring agency to take action with reference to
his problems is an important influence on the respondent's an-
swers in other situations involving the interview. This is pointed up by
Hofstein's description of the structure of the army-counseling inter-
view. He states:


The relationship to command defines the role of the personnel consultant
in any [illegible] or counseling situation. He cannot do anything without
the approval of the command. He cannot assume any role in his pro-
fessional relationships except that of a representative of the command. At
first thought the relationship so characteristic of and necessary to the Army
may appear to be limiting. Yet it is precisely because of this relationship that
the personnel consultant can be helpful to individual soldiers.


The studies cited in the foregoing reveal that responses may be af-
fected by the stated sponsorship of the survey under conditions where
the perception of the role of the sponsoring agency is well-structured
and relevant to the issues posed in the questions. Admittedly, some frame
of reference is set in any survey situation, and the answers of respond-
ents are [illegible] only in terms of this frame of reference.
What appears to be crucial is the degree to which this frame is highly
structured and what meaning it has for the respondent.

The difference in results between the NORC and Crespi studies is a
demonstration of the complexity of situational factors. What is nomi-
nally the same kind of situational factor, government sponsorship, oper-
ates differently in the two experiments because of the different meanings
of such sponsorship under the respective conditions. Further complexity
is suggested by detailed findings within the German study. The type of
effect is a function of the questions used. It is dependent on whether
stronger opposing motives are set in operation. While inflation of pro-
governmental responses is the effect observed most frequently, on oc-
casion other effects are noted. When the questions asked involve the
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possibility of a remedy for existing difficulties, we find the personal 
needs of the respondents accentuated and criticism implicit in the an- 
swers. The lack of such effects in the NORC study may reflect the less 
severe need in the United States for government action to remedy exist- 
ing difficulties; it may also reflect the fact that the questionnaire did 
not touch closely on areas where governmental action may have been 
deemed necessary to remedy deeply felt frustrations. 

It is hoped that the foregoing discussion of these two situational fac- 
tors—anonymity and sponsorship—serves to point up the complexity 
of situational factors and the consequent difficulty of studying appro-
priately their interplay with interviewer effects. Nevertheless, in study-
ing interviewer effect as a function of situational factors, it is possible
to demonstrate through properly designed experiments that differently 
structured situations may act as mediating agents for the introduction of 
bias. Although a multitude of factors may be operating in any given 
situation to induce interviewer effects, particular characteristics of given 
situations are frequently discernible as the probable basis for the occur- 
rence of these effects. And in controlled situations, the existence of 
these effects, originated by the interviewer but induced by situations, 
becomes capable of isolation and measurement.


4. EFFECTS ARISING FROM SPECIFIC SITUATIONAL FACTORS 


It seems fruitful to attempt some kind of classification of situations 
according to the characteristic problems which they present to inter- 
viewers. Although there are many elements in the situation itself, there 
are only a few ways in which these factors mediate the operation of bias. 

We consider first the relation of situational structure per se to inter- 
viewer effect. All situations may be schematized along a continuum of 
the “degree of freedom” they permit the interviewer. Although distor- 
tion of data may arise from the imposition of a too rigidly structured 
interview situation, most of the evidence accumulated suggests that in- 
terviewer effect, in so far as it is related to the degree of structuring, 
arises from the lack of a well-defined and structured interview situation. 
Thus, we turn first to consider the nature of effects arising from situ- 
ations characterized by this quality. 


Effects Arising from Lack of Structure in Procedure 


The development of large-scale opinion and attitude research brought 
with it an increase in the degree to which forms of inquiry were struc- 
tured. The unstandardized type of interview, characteristic of clinical 
psychology, was of necessity unsuitable for large-scale research, for in 
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clinical studies the interview has as its primary purpose the diagnosis or 
therapy of an individual, while in survey research the analysis and re- 
porting of mass opinions or behavior and of group differences in these 
opinions is the principal objective. 

Just as it is essential that the clinician be enabled by his technique to 
pursue whatever lines of inquiry seem to him to be important in the 
individual case, so is it necessary in survey inquiries that the interviewer 
be prevented from following just whatever paths he may think impor-
tant. The entire validity of survey procedure rests upon the foundation
of standardization. If we wish to report and analyze and compare group 
data, we must make certain that the responses of the many individuals 
to the different interviewers are responses to essentially the same stimuli. 

Occasionally, there are research operations of a quasi-clinical type in 
which a few highly trained interviewers, homogeneous in background, 
work on a study and are guided by their uniform and thorough famili- 
arity with the research objectives. Under such conditions, the assump- 
tion might be warranted that each interviewer would employ techniques 
that were ideally suited to the given respondent and yet all would work 
in fairly parallel and unbiased fashion. But such an assumption seems 
hardly warranted for the usual large-scale survey in which the massive-
ness of the inquiry necessitates the use of large numbers of interviewers
of unequal backgrounds so widely distributed geographically that con-
trols are difficult to enforce. Moreover, in the former instance the inter-
viewer is at the same time often the analyst, and he can juxtapose the
findings against his first-hand knowledge of the operations which elic-
ited the data. It is essential in any analysis that the results must always
be interpreted in terms of the measurement situation. Given the separa-
tion between interviewer and analyst in the usual survey, the only way
in which the analyst can know the nature of the field-setting is by
specifying it for the interviewers.

If the stimulus situation is really vastly different for each respondent
in a survey (or even for a portion of them), then we cannot with good
conscience combine these responses into group opinions or make com-
parisons between groups of respondents. We cannot always be sure that
the same questions do have the same meaning to different respondents.
There is empirical evidence that this is sometimes not the case.” More-
over, there are special instances where, on a priori grounds, diversity
among respondents is so marked that verbal standardization would pro-
vide no insurance of uniform meaning. Such might be the case, for ex-
ample, in a survey conducted in several different national [illegible].
But where diversity is not so striking, we can at least control the con-
ditions under which the questions are asked, so that in so far as pos-
sible we mitigate any likelihood of obtaining uncombinable responses.
Whether or not we can ultimately devise techniques to assure that a 
question will have the same psychological meaning to different respond- 
ents is beyond the scope of this discussion. Kinsey, by allowing his in- 
terviewers to use the terminology which they felt to be applicable, 
attempted to standardize the psychological meaning of a question by 
unstandardizing the wording. Certainly the possibility of adapting this 
technique to public opinion research deserves consideration. 

However, until such time as techniques are devised which make cer- 
tain that stimuli will be functionally standard for all respondents, re- 
search must rest upon the assumption that verbal standardization is the 
nearest reliable approximation we can achieve. Though frames of refer- 
ence may vary among respondents, it seems reasonable to suppose that 
the limits of the variation are closer if the verbal stimulus is
standardized.

While structuring of stimulus situations was originally developed as 
an aid to standardization in general, more important for our discussion 
is the control over interviewer effect which it provides. All other things 
being equal, the more controlled the interviewer's activities, the less the 
likelihood that variations in results can be attributed to the idiosyncra- 
sies of the different interviewers. Although it is, of course, possible to 
standardize an interview situation in such a way that we facilitate the 
introduction of some systematic bias among all interviewers, there can 
be little doubt that by giving the interviewer greater freedom in the 
interview situation we lay ourselves open to the infinite variability in 
human capacities that has been so well documented in psychological 
literature.

Differences between interviewers come into play in all phases of the 
interview situation. Differences in intellectual capacities may mean
variation in understanding the objectives of the survey, the aims of the
questions, and the meaning of responses. Sensory differences may lead 
to varying perceptions of significant respondent characteristics and to 
differential attentiveness to answers. Differential motor skills may result 
in recording differences. 

That mere interviewer ineptitude is itself a source of error is evident 
from experiments done under laboratory conditions with no respondent 
present. Here, clearly, errors cannot result from reactional processes. In 
such studies, we find that error which is merely clerical and not in any 
way motivated by bias can be quite large in magnitude. For example, in
the study by Guest and Nuckols we find that for three experimental
phonographic transcriptions of interviews to which interviewers listened
and recorded responses the degree of such non-biasing error is 45 per
cent, 62 per cent and 66 per cent, respectively, of all errors committed.
Further, the interviewers, although a homogeneous group of students from
the same institution, varied considerably in the degree to which they
made such errors, a fact which underlines the importance of differences
in interviewer skills. In the Guest and Nuckols study, the range of
non-biasing error among twenty-four interviewers was from three to
sixteen errors, fully as great a range as was found for biasing errors.

Beyond these differences in ability, however, there are others which 
may be even more important for the interview situation. Chief among 
these is the variation in “social stimulus value” among interviewers. 
Were interviewers selected from the population at large, such differ- 
ences would of course assume tremendous proportions. But it is true 
that interviewers as an occupational group tend to vary far less than the
population as a whole. The relative homogeneity of interviewers as a
group, with respect to background characteristics, has been documented
by Sheatsley. While this itself may be a source of systematic bias, as
discussed in the foregoing, it does limit the range within which
individual differences among interviewers may operate to distort answers.

However, if we examine some of the data collected by NORC on the
psychological characteristics of their interviewing staff, we find, even
among this relatively homogeneous group, differences in the extent and
type of social relationship established with respondents. While
demographically they have much in common, psychologically they are fairly
diverse. Consider the following data culled from 150 of NORC's current
field staff:


TABLE 43

COMPARISON OF SOME FACTUAL AND ATTITUDINAL CHARACTERISTICS
OF 150 NORC INTERVIEWERS*

                                                                 Percentage
Women .........................................................      88
Men ...........................................................      12
Not main earner ...............................................      77
Main earner ...................................................      23
Have children .................................................  [illegible]
No children ...................................................  [illegible]
Attended college ..............................................      81
Never attended college ........................................      19
Prefer to keep problems to themselves .........................      62
Prefer to talk over with others ...............................  [illegible]
Never get annoyed with respondents' opinions ..................  [illegible]
Sometimes get annoyed .........................................  [illegible]
Often feel like staying and chatting with respondent ..........  [illegible]
Seldom or never feel like staying .............................  [illegible]
Have occasionally or frequently made friends with respondents .      58
Never made friends with respondents ...........................      42

* Factual data from Sheatsley (ibid.), attitudinal data from NORC's study of
interviewers using the mail questionnaire described in Chapter II.


Differences between interviewers in psychological characteristics and 
temperament may have crucial effects on the kind of interview situation 
in which they secure data. We might expect that rapport in the inter- 
view situation will vary and that the kind of spontaneous interaction 
that will exist between interviewer and respondent will likewise be sub- 
ject to wide variation. In the absence of a structure imposed by the 
agency, then, such personality differences as exist among interviewers 
will affect the way they themselves act out their role. 

The major consequence of structuring the interview is to impose re- 
straint upon variable tendencies among interviewers. There is the ac- 
companying danger of introducing some constant error through a stand- 
ardized, but misguided, procedure or an excessively artificial procedure. 
The unstructured procedure clearly allows full sway for variations in
interviewer behavior, but it may have the virtue of keeping any constant
error due to bad or overly rigid procedures at a minimum. However, it
should be noted that in addition to effects which result from lack of
control over human variability in unstructured interview situations, such
situations may also permit, under special conditions, the maximum
operation of a constant error among all interviewers. This would be the
case when some basic psychological process, common to the interviewers,
is a source of error unless controlled. Intelligent, standardized
procedures, designed in relation to such processes, can control or reduce
constant errors. That such processes occur very frequently is clear from
earlier chapters.


Each of the many aspects of the public opinion interview is subject to
structuring by the agency. That is, we can design the situation in such
a way that the task of the interviewer is clearly defined and delimited,
or we can, at any point in the process, order the situation so that the
interviewer’s judgments come into play. Within the realm of question
construction, questions themselves may be narrow in focus or very 
broad. We may provide answer boxes in which two or three or more 
categories are provided for the interviewer to check the appropriate 
response, or we may ask the interviewer to record verbatim everything 
said by the respondent. Clearly the more we specify the task, the more 
we have structured the situation for the interviewer. 

In certain respects, the free-answer question would seem to provide 
maximal opportunity for the operation of interviewer effects deriving 
from lack of controls over variability in behavior. The tasks of asking 
the question and recording the answer are not nearly so rigidly defined 
as in pre-coded questions, since the interviewer must decide when and 
how often to probe, what probes to use, and what phrases in the total 
answer are redundant and can therefore be omitted from the recording. 
Consequently, studies of error in the use of such questions provide
opportunity for evaluating effects occurring in unstructured situations.
In recent studies, evidence is presented to demonstrate that error in
free-answer questions, when handled by the average interviewer, is, in
fact, of high frequency. Two specific ways in which such effects can be
manifested form the focus of these studies: (1) selective recording of
responses and (2) differential probing behavior among interviewers.

In the aforementioned study by Guest and Nuckols, twenty-four
interviewers were asked to record interviews from three phonographic
transcriptions concerned with labor-management relations. The three
respondents recorded gave pre-arranged answers, one predominantly
pro-management, one predominantly pro-labor, and one about neutral. Both
alternative type and free-response type questions were used. There were
[illegible] chances for alternative type errors and twenty-six chances
for free-response type errors.

Comparing recording error on free-answer questions with similar error on
pre-coded questions, Guest and Nuckols conclude that free-answer
questions not only produce more total errors, but also more biasing
errors.

In the two interviews, neutral and pro-management, the proportion of
biasing errors is about the same for both types of questions. On the
pro-labor interview, we find a fairly heavy pro-labor bias on the
pre-coded questions and a rather heavy pro-management bias on the
free-response questions. That pro-labor bias in the pro-labor interview
was evident only on the pre-coded questions suggests that assimilation of
doubtful responses to attitude-structure expectations is characteristic
of interviewers using pre-coded questions, while other sources of bias
operate more strongly under the free-response form.
TABLE 44

TYPE OF ERROR AS A FUNCTION OF TYPE OF QUESTION AND TYPE OF INTERVIEW
(IN PER CENT)

                                   TYPE OF QUESTION
                      Fixed Alternative                 Free Response
                   Error in Direction of:           Error in Direction of:
Resp. in                   Manage-                          Manage-
direction of:      Labor    ment    Neutral  Total  Labor    ment     Neutral      Total
Labor ..........     55      10       35      100     47      51    [illegible]     100
                                              (29)                                  (47)
Management .....     29      12       59      100     33       3       64           100
                                              (34)                              [illegible]
Neutral ........     18      14       68      100     22      14       64           100
                                              (71)                                  (28)


Guest and Nuckols suggest that on free-response questions interviewers
tend to make errors away from the dominant theme of the interview.
Although we have no empirical knowledge of why this phenomenon occurs, it
seems logical that in free-response questions interviewers might tend to
omit recording repeated statements of a given theme. Thus, if a
particular sentiment is once expressed and recorded, interviewers might
select contrary or separate themes to record rather than repetitions of
the already recorded theme. If this occurred in Guest and Nuckols'
experiment, it would account for their finding that interviewers tend to
record responses away from the dominant theme of the interview.

Although Guest and Nuckols found no evidence that the selective recording
of free-answer material was in the direction of the interviewer's
ideology, Fisher has been able to demonstrate such effects in a
laboratory experiment of similar design conducted at the University of
Chicago. Measuring the degree of error in free-answer questions only,
Fisher found a significant relationship between the interviewer's
ideology and his selection of phrases to record. (See Table 45.)

Interviewers tended to record more statements which conformed with their
own attitudes toward the two controversial issues. Those favoring Wallace
recorded 12 per cent more of the possible pro than of the possible con
statements; those opposing Wallace recorded 4 per cent more of the
possible con statements than of the possible pro statements; those
favoring the draft recorded 5 per cent more pro statements; and those
opposing the draft recorded 16 per cent more con statements.

Differences in the types of responses most subject to interviewer effect
are provided by Fisher's data, and confirm findings from other studies.
Significantly more biasing error on free-answer questions occurred when
responses were equivocal, rather than unequivocal. Other data reported
below suggest that this is also true with regard to error from pre-coded
questions.


TABLE 45

NUMBER OF PRO AND NUMBER OF CON STATEMENTS RECORDED BY THIRTY-TWO
INTERVIEWERS ON DRAFT AND WALLACE ISSUES IN RELATION TO THEIR OWN
POSITION ON THESE ISSUES*

                                    Interviewers Who            Interviewers Who
                                     Favored Draft               Opposed Draft
                                 Pro-Draft    Con-Draft      Pro-Draft    Con-Draft
                                 Statements   Statements     Statements   Statements
Total number of statements
  possible to record ..........     279          279            713          713
Number of statements recorded .     169          156            309          424
Per cent recorded .............      61           56             43           59

                                    Interviewers Who            Interviewers Who
                                     Favored Wallace             Opposed Wallace
                                 Pro-Wallace  Con-Wallace    Pro-Wallace  Con-Wallace
                                 Statements   Statements     Statements   Statements
Total number of statements
  possible to record ..........     624          608            624          608
Number of statements recorded .     331          261            331          349
Per cent recorded .............      55           43             53           57

* Total average number of statements recorded: 53 per cent.
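
The percentage differences quoted for the draft issue can be recomputed from the counts in Table 45. The following is a minimal sketch, not the original authors' procedure: the names `DRAFT_COUNTS`, `percent_recorded`, and `pro_minus_con` are hypothetical, and rounding each percentage to a whole number before subtracting is an assumption about how the printed figures were derived.

```python
# Counts transcribed from the draft-issue half of Table 45:
# interviewer position -> statement type -> (possible to record, recorded).
DRAFT_COUNTS = {
    "favored draft": {"pro": (279, 169), "con": (279, 156)},
    "opposed draft": {"pro": (713, 309), "con": (713, 424)},
}


def percent_recorded(possible, recorded):
    """Recorded statements as a whole-number per cent of those possible."""
    return round(100 * recorded / possible)


def pro_minus_con(counts):
    """Per cent of pro statements recorded minus per cent of con statements recorded."""
    pro = percent_recorded(*counts["pro"])
    con = percent_recorded(*counts["con"])
    return pro - con


for position, counts in DRAFT_COUNTS.items():
    print(position, pro_minus_con(counts))
```

A positive difference means that proportionally more pro statements were recorded; a negative difference means that proportionally more con statements were recorded.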

Evidence of the existence of effects in free-response questions, as well
as insight into the manifestations of these effects, is provided in a
field experiment reported by Feldman, Hyman, and Hart. A total of
forty-five interviewers was divided into five teams of nine each, and
each team received equivalent assignments. The investigators found little
evidence of effects on traditional “polling type” questions, yet a good
deal on free-response questions included in the same questionnaire. The
errors seemed traceable to differential probing behavior. The data are
presented in Table 46.

Differential probing behavior is revealed in this study, first of all, in
the number of separate answers elicited by interviewers. Here we find
significant differences among interviewers working within the same
sectors of a city, with equivalent samples, on all four questions tested.
Elicitation of multiple answers seemed to be related to the experience 
of the interviewer; by and large those interviewers with the greatest 
experience tended to elicit more multiple answers. 
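
Interviewer differences of this kind are commonly tested with the ratio of the variance between interviewers to the variance within interviewers. The following is a minimal sketch of that arithmetic under stated assumptions: the function name `f_ratio` and the answer counts are hypothetical illustrations, not data from the study.

```python
def f_ratio(groups):
    """One-way analysis-of-variance F-ratio.

    `groups` is a list of lists; each inner list holds, for one interviewer,
    the number of answers obtained from each of that interviewer's respondents.
    Returns (variance between interviewers) / (variance within interviewers).
    """
    k = len(groups)                        # number of interviewers
    n = sum(len(g) for g in groups)        # total number of respondents
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    between_ss = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    within_ss = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    return (between_ss / (k - 1)) / (within_ss / (n - k))


# Hypothetical counts of answers per respondent for three interviewers.
hypothetical_counts = [
    [1, 2, 1, 2, 1],
    [2, 3, 2, 3, 3],
    [1, 1, 2, 1, 1],
]
print(f_ratio(hypothetical_counts))
```

A ratio near 1 indicates that interviewers differ from one another no more than respondents of the same interviewer do. Judging significance requires referring the ratio to the F distribution with (k - 1, n - k) degrees of freedom.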


TABLE 46

LEVEL OF SIGNIFICANCE OF INTERVIEWER VARIATION IN NUMBER OF ANSWERS OBTAINED
ON OPEN QUESTIONS

Question                         Sector I        Sector II       Sector III      Sector IV       Sector V

Suggestions for improvements
  in Denver ...................  nonsignificant  .05             .05             .01             nonsignificant
Reason for moving to Denver* ..  nonsignificant  nonsignificant  .05             nonsignificant  nonsignificant
Reasons for attitude toward
  neighborhood for satisfied
  group† ......................  .01             nonsignificant  nonsignificant  nonsignificant  [illegible]
Reasons for attitude toward
  neighbors† ..................  nonsignificant  nonsignificant  .01             nonsignificant  .01

* While the F-ratios (variance between interviewers / variance within
interviewers) do not reach the .05 level of significance in four of the
sectors, the P-values are relatively low. When the exact P-values from the
sectors are combined to get an aggregate test by using Fisher's logarithmic
transformation, the difference among interviewers in the aggregate is
significant at the .05 level. For the other questions, no exact test was made
for the five sectors aggregated because the over-all significance should be
clear from mere inspection, and the laborious procedure was unnecessary.

† The number of respondents dissatisfied with their neighbors or neighborhood
was too small in the total sample to permit any separate test of interviewer
differences in number of reasons for this attitude.
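
Fisher's logarithmic transformation, mentioned in the first footnote to Table 46, combines k independent P-values through the statistic -2 Σ ln P, which is referred to a chi-squared distribution with 2k degrees of freedom. The sketch below illustrates the calculation using only the standard library. The function names are hypothetical, and the five P-values are invented for illustration, since the exact sector P-values are not reported in the text.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability of chi-squared for an even number of degrees of freedom.

    For df = 2k the tail has the closed form exp(-x/2) * sum_{i<k} (x/2)**i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combined_p(p_values):
    """Combine independent P-values with Fisher's logarithmic transformation."""
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return chi2_upper_tail_even_df(statistic, 2 * len(p_values))


# Hypothetical P-values for five sectors: only one reaches .05 on its own.
hypothetical_p_values = [0.20, 0.12, 0.04, 0.15, 0.30]
print(fisher_combined_p(hypothetical_p_values))
```

With these illustrative inputs the combined probability falls below .05 even though four of the five individual values do not, which is the situation the footnote describes.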

Perhaps even more important from the point of view of validity of 
data secured through free-response questions is the finding of Feldman 
and his associates that the tendency to elicit multiple answers affects the 
degree to which interviewers obtain answers whose contents are “rare.”
The data are presented in Table 47.

In pointing out the importance of this phenomenon for the interpretation
of survey data secured through free-response questioning, Feldman, Hyman
and Hart state:

In drawing conclusions from survey data, it is common practice to use the
infrequent occurrence of certain categories as a basis for interpretation.
In all probability, the results of such categories, involving secondary
opinions, are biased in the direction of under-representation because of
the likelihood that at least some interviewers did not elicit multiple
answers. More important, such an overall set of data will contain a
mixture of primary opinions and secondary opinions due to the variation
among the interviewers in ability to obtain multiple answers. If
interviewers varying in probing habits are not distributed evenly over the
entire sample, it is likely that obtained differences between types of
respondents may not be real differences, but merely differences in the
degree to which secondary opinions have been elicited.


TABLE 47

THE RELATIONSHIP BETWEEN THE NUMBER OF ANSWERS PER RESPONDENT
OBTAINED BY AN INTERVIEWER AND THE PERCENTAGE OF RESPONDENTS
GIVING ANSWERS IN A PARTICULAR SECONDARY CATEGORY
(IMPROVEMENTS IN INDUSTRY AND COMMERCE)

                   PERCENTAGE OF RESPONDENTS GIVING ANSWERS IN THE
                   SECONDARY CATEGORY OF RESPONDENTS OF:

                   The Three Interviewers   The Three Interviewers
                   Getting the Largest      Getting the Smallest
                   Number of Answers Per    Number of Answers Per
                   Respondent in            Respondent in            Difference in
                   Their Sector             Their Sector             Percentages
Sector I ........           24                        7                   17
Sector II .......           18                       12                    6
Sector III ......           12                       13                   -1
Sector IV .......           20                        7                   13
Sector V ........           15                        4                   11
All sectors .....           18                        8                   10


A demonstration of the operation of effects on primary categories of
response in free-answer questions (i.e., very prevalent attitudes) is also
provided by this study. In this instance, the mechanism of differential
probing seems irrelevant to the ability to obtain responses of such
primary categories from any given respondent. The study sought the
explanation in some other mechanism. The authors present suggestive
evidence that such effects are independent of extent of probing and are a
function of expectations, i.e., the interviewer's belief that a particular
category of response is important somehow affects his tendency to obtain
answers within that category. Interviewers who regarded neighbors as very
important in the choice of neighborhood tended to have more respondents
who mentioned neighbors as a reason for their choice of neighborhood than
those interviewers who regarded neighbors as of little importance, but
the differences were not statistically significant. However, in view of
the [illegible] and the consistent direction of the findings, it seems
wise not to reject the possibility that interviewer effects occur even on
primary categories of response to free-answer questions.

The data presented establish the fact that free-answer questions are
subject to considerable interviewer error, arising both from interviewer
variability and from the systematic operation of psychological processes.
Thus, the findings lend general support to the notion that
unstructured procedures may provide a favorable milieu for the opera- 
tion of interviewer effects. 

We have described an “unstructured situation” as one in which the 
maximum opportunity exists for variations in interviewer activity. From 
this point of view, the procedure of asking interviewers to make “field 
ratings” of various respondent characteristics would seem to be a pro- 
cedure in which minimal structuring exists, as far as the judgmental 
requirements of the task are concerned. True, the categories are pro- 
vided (or points on the scale) as in pre-coded questions, but the inter- 
viewer is not “tied down” to the classification of a particular response. 
Since the respondent makes no “response” as such, but is classified ac- 
cording to some general observed characteristic, the subjective judg- 
ment of the interviewer is allowed free play. In such a situation, one 
would expect effects to be maximal. 

In the aforementioned study by Feldman and associates, the most 
striking occurrence of interviewer effects was noted in the variation in
field ratings. Six such ratings were tested, and five “yielded P values so 
microscopic that the results certainly cannot be attributed to sampling 
variation.” 


TABLE 48

TESTS OF INTERVIEWER EFFECTS ON FIELD RATINGS

                                          Pooled        Pooled
                                        Chi-Squared   Degrees of   Probability
                                           Value       Freedom
Condition of dwelling ................    193.78         120         <.0001
Condition of block ...................    169.89          80         <.0001
Degree of hostility of respondent ....    125.56          80          .0007
Degree of respondent's interest ......    151.01          64         <.0001
Respondent's intelligence ............    214.73         120         <.0001
Respondent's evasiveness .............     48.14          40          .18

It is worth noting that even ratings of “factual” characteristics showed
immense variability. The authors point out that differences in ratings of
qualities such as “intelligence” or “hostility” might reflect actual
differences in interviewer-respondent interaction, but ratings of
“condition of dwelling unit” and “condition of block” must represent
sheer interviewer differences, under controlled sampling conditions.
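
The probabilities in Table 48 can be checked approximately from the pooled chi-squared values and degrees of freedom. The sketch below is illustrative only: the function name `chi2_upper_tail` is hypothetical, and it relies on the fact that every pooled figure for degrees of freedom in the table is even, which permits a closed-form tail probability without statistical libraries. Computed tails need not agree with the printed values to the last digit.

```python
import math


def chi2_upper_tail(x, df):
    """Upper-tail probability of chi-squared; valid only for even df.

    For df = 2k the tail equals exp(-x/2) * sum_{i<k} (x/2)**i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# (rating, pooled chi-squared value, pooled degrees of freedom) from Table 48.
FIELD_RATINGS = [
    ("Condition of dwelling", 193.78, 120),
    ("Condition of block", 169.89, 80),
    ("Degree of hostility of respondent", 125.56, 80),
    ("Degree of respondent's interest", 151.01, 64),
    ("Respondent's intelligence", 214.73, 120),
    ("Respondent's evasiveness", 48.14, 40),
]

for rating, chi_squared, df in FIELD_RATINGS:
    print(f"{rating}: P = {chi2_upper_tail(chi_squared, df):.5f}")
```

Only the rating of evasiveness yields a tail probability large enough to be attributed to sampling variation, in line with the table.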

We must conclude that the task of making field ratings of respondents or
of environmental characteristics presents the type of situation in which
interviewer effects are maximized, and resultant data highly unreliable.

Of course, interviewer effect is only one of many considerations
which a designer of surveys must take into account. Thus, where field 
ratings are indispensable for the purposes of a study, we cannot demand 
that they be sacrificed simply on the grounds of such imperfection. 
Similarly, open-ended questions may often be indispensable for reveal- 
ing certain variables not amenable to study in other ways. In such in- 
stances, susceptibility to interviewer effect may become a secondary 
consideration in the choice of a procedure. 

However, when such methods are applied, our findings caution us to 
be especially attentive to interviewer effects and to institute careful 
measures of control. Our findings also suggest that control may have to 
take the form—in part, at least—of more enlightened and effective 
structuring of the interview situation. 


Effects Arising from Increased Opportunity for a Respondent Reaction


In Chapter IV, it will be recalled, we documented the inference that
[illegible] reactions of respondents are likely to result from the extent
and nature of their social involvement in the interviewing situation.
Although [illegible] is heightened by both task and social involvement,
validity of the respondent's answers to questions seems to depend on the
achievement of a nice balance between task and social involvement. Such
reactions may be expected to come into play in any situation in which we
increase one or more of the factors which facilitate reactional effects.
Theoretically, this may occur at any point in the process.

Agencies are much concerned about the perceptions which respondents may
develop of interviewers. Thus the mere fact that an interviewer knocks on
a door and gives some introductory speech might set up a tendency in the
respondent to perceive him as a salesman. Agencies caution interviewers
against dressing or behaving in any way that might create a deviant kind
of perception in the respondent. The interviewer is supposed to dress
inconspicuously and adopt a uniformly friendly and informal manner in his
approach.

While we have a great deal of data on the existence of reactional effects
per se, we have very few experiments where such effects can be traced to
specific factors. One of the few such tests is available from the data
collected by Mosteller and his associates in the SSRC study of the 1948
pre-election polls.

A comparison of results for several interviewers using both secret and
nonsecret ballots provided us with a test in which the actual role of the
interviewer is altered in two ways. We have, first of all, a comparison
of situations where the question is verbalized by the interviewer, and
one in which it is handed to the respondent on a written ballot.
Secondly, we have a comparison of situations in which the respondent's 
opinion is made known to the interviewer or kept secret from him. In 
accordance with the theory previously stated, we would expect that 
when the interviewer verbalizes the question he would automatically 
occupy a larger part of the psychological field and therefore induce 
more effects. Also we would suppose that when the respondent is al- 
lowed to keep his opinion secret there will be less social involvement, 
due to a lesser concern for the characteristics of the interviewer, and his 
anticipated approval or disapproval of the responses. 

The data from this study, however, do not bear out our hypothesis. 
Comparing the results secured by the two different Gallup interviewers 
working successively in two cities, we find that, for each of them in
each city, the results obtained under the two methods—secret and non- 
secret—did not vary significantly. 

However, earlier experiments of the AIPO with secret ballot tech- 
niques suggest that despite the personal presence of the interviewer, 
differences in results do occur on some items among urban groups when 
the respondent’s answers are not revealed. Turnbull finds large and sig- 
nificant differences on questions in which the respondent's prestige 
might be affected and small and nonsignificant differences in other ques- 
tions when the secret ballot is used. Kemper and Thorndike report 
similar findings from a survey of 1,000 men in the city of Louisville. 
Student interviewers, many of them with past experience, inquired 
about the respondent's psychosomatic symptoms, using personal inter- 
view and secret ballot techniques. Presumably the revelation of a symp- 
tom would be prestige deflating. Significant differences were found for 
six of the twenty-two questions, with the secret ballot yielding a more 
frequent report of “maladjustment” in five of these instances. The 
writers note, however, that the difference in average adjustment, pre- 
sumably computed from the total scale score, between the two methods 
was small. 

Another test of the same general phenomenon is reported by Huth, who
tested interviewer-respondent agreement in opinions (as a measure of
interviewer effect) in two situations. In one, an ordinary personal
interview was conducted, and in the other, the questionnaire was left
with the respondent “to think about,” the interviewer returning at a
later date to conduct the interview. Presumably, there should be less
social and more task involvement in the latter situation, since the
respondent has had more time to become involved in the task itself, and
is in a sense “fortified” against effects deriving from the perception of
the interviewer characteristics.

On two of seven questions—those dealing with a state veterans’ bonus 
and peacetime military training—she found significant association be- 
tween interviewer and respondent opinion in the nondeliberative situa- 
tions, while only the bonus question showed significant association in 
the deliberative situations. The other five questions were concerned 
with prohibition, attitudes toward Negroes, war possibilities, interest in 
voting, and need for more industries in Denver. 

Although the study tested only a small number of questions for inter-
viewer effect, the results are most consistent. Moreover, the issues posed 
—e.g., drinking, race relations, voting—seem to be highly loaded with 
social content and therefore susceptible to reactional effects. Yet out 
of seven tests, in only one case was there a significant interviewer effect 
observable under the nondeliberative condition that was not also ob- 
servable under the deliberative condition. On all other questions, inter- 
viewer effect was either absent or present under both conditions. 

The lack of demonstrable effect in these specific tests of our hypothesis
does not deny its general validity. Although we do not find bias
measurably increased in situations where the interviewer is presumably
occupying a larger portion of the psychological field, it is probable
that, even where the interviewer had merely provided a secret ballot for
the respondent, the social involvement is sufficiently large to
approximate a more interpersonal situation. For, if bias could have
occurred as a function of respondent reaction to perceived group
membership or other characteristics of the interviewer, this would
function in independence of any verbalization by the interviewer.
Although the respondent's ballot is secret, there may not be
psychological anonymity for him so long as there exists a face-to-face
relationship with the interviewer.

The data in Chapter IV provide ample evidence of the hypothesis that the
culturally defined significance of the interviewer's characteristics is a
potent source of bias. We have seen that differences occur as a result of
the respondent's perceiving the interviewer's color, religion, sex, class
membership, and residence and his reacting in some emotional way to the
characteristic.

It will be recalled, however, that the effects noted were not uniform.
Differences were discernible on some questions and not on others for
several of the studies discussed. Thus, in addition to the procedure of
questioning or question form, question content may be a most important
factor in the mediation of reactional effects. Where question content
does not relate in some way to the group membership of respondent and/or
interviewer, we would not expect reactional effects,
but where the relationship between the content of the question and the 
group membership factor is clearly evident, reactional effects may be 
expected to be maximal. This difference would come under our cate- 
gory of situational differences. 

This factor is illustrated by the comparison of questions from the 
study of Negro and white interviewers in Memphis, discussed in Chap- 
ter IV. In Table 31 the questions were classified by the level of sig- 
nificance of difference. 

Looking back at this table, we see that questions with particular types 
of content are more likely to show differences. First of all, it is clear 
that, on most of the nonattitudinal questions, differences between the 
groups are not significant, whereas on the attitudinal questions most 
differences are highly significant. The only nonattitudinal questions on 
which differences are significant are the questions referring to automo- 
bile ownership, and the Negro newspaper read. Negro respondents were 
less willing to admit owning an automobile or reading a Negro news- 
paper when interviewed by white interviewers. While these questions 
are factual, it is obvious that they are clearly related to the problems 
raised by group membership. Negro respondents in the South are aware 
that white Southerners may frown on any signs of Negro affluence and 
might prefer that Negroes read the local “white” newspapers. 

A study of the summary also reveals that questions which in any way 
attempt to measure attitude toward the “government” or the conduct 
of the war produce the most significant differences. The respondents
seem to be very careful to avoid any suggestion that they might be
“unpatriotic” or dissatisfied with government policies when talking with
white interviewers. This is especially noticeable on the question asking
who is to blame for job discrimination against Negroes. They are just
as willing to blame managers and labor unions when talking with white
interviewers as with Negro interviewers but are considerably less
willing to blame the government when interviewed by whites. Likewise,
protests over segregation are significantly more often mentioned by
Negroes when talking with Negro interviewers, while complaints about
“housing” are the more frequent response given to white interviewers
in answer to the question, “What do Negroes feel worst about?”

These data document the importance of question content in the
introduction of reactional effects. When the respondent is affected by the
group membership of the interviewer, his answers will be affected on
questions which are in some way related to the area of group membership.
In general, the Memphis study indicates that the further removed
the question is from problems of Negro-white relations in the South, 
the less likely it is that reactional effect will occur. 

Clearly, lack of structuring in the interview and respondent conformity
to the perceived social requirements of the interview situation are
not the only channels through which situational factors bring about
bias. Constant bias over the staff may well result from the construction
and standardization of a particular kind of biasing situation by the
agency. This may come about by more or less direct means (such as
the construction of badly worded questions), or by indirect means such 
as the setting of a type of situation that presents the interviewer with a 
task which either mechanically or psychologically involves certain dif- 
ficulties. In such cases, bias seems to arise from the attempt of inter- 
viewers to solve the problems with which they are faced. That such 
tasks need not necessarily be taxing to the interviewer, nor that he need 
even be aware that he faces a difficult task, is clear from the data which 
will be presented below. We consider first situations illustrative of 
mechanical difficulties for interviewers and the way in which effects 
may come into play as a task aid. 


Effects Arising from Mechanical Difficulties of the Task 


When demands made upon the interviewer are beyond what is realistically
attainable, it may be presumed that the data are affected. For, as
revealed by the case material in Chapter II, interviewers normally accept
and fulfil their prescribed role, but when pressures become too great,
they may be unable to maintain it. Occasionally the mere mechanical
difficulties are so great that demoralization sets in, and interviewers
consciously or unconsciously distort data so as to enable them to comply
with the mechanical requirements of the task. Crespi in his discussion
of interviewer cheating states that demoralizing demands on the
interviewer are the primary causes of cheating behavior. He lists as
common demoralizers such features as unreasonable length of questionnaires,
too frequent probes, apparent repetition of questions, complex and
difficult or antagonizing questions, part-time work, and overly difficult
assignments. In addition, he mentions external factors, such as
weather and transportation difficulties, as causing interviewer
demoralization. Analysis of interviewer report forms has led Sheatsley to
conclude that similar factors are prime causes of low interviewer morale.

Even seemingly innocuous features of a questionnaire may conceivably
cause difficulty and affect responses. For example, according to Payne,
it can be demonstrated that the amount of white space allowed for the
written responses is sufficient to affect the length of the response
received on free-answer questions. This theory is supported by
qualitative evidence gleaned from interviewers, one of whom, in recording an
interview from a phonograph record, stated: “I feel irritated; I have no 
room—have to write all over the place. How can you write verbatim 
when there is no place to write verbatim? . . . I get doubtful—am I 
writing down the things which are really important? I may not be ob- 
jective in that I’m picking out certain things and leaving out others.” 

However, in one empirical test of this phenomenon, Fisher reports 
that the amount of space made no difference in the number of state- 
ments recorded in response to free-answer questions.* He found that 
interviewers would simply write smaller or write in the margins, where 
space was limited. 

The experienced difficulty of specific situational factors must, of 
course, be qualified in the light of our earlier remarks about the recruit- 
ment and training of interviewers who would be capable of greater 
frustration tolerance, and the fact that the larger survey requirements 
may necessitate using unpleasant procedures. 

When such difficult situations occur, we would not normally expect 
any systematic bias over the whole staff to be evident. Rather we an- 
ticipate diffuse errors in the data, since the only psychological process at 
work is the interviewer's desire to extricate himself from a difficult 
situation, and often he can do this in a variety of ways. However, if 
there is only one path which any interviewer may take to reduce the 
difficulty of the task then one would expect systematic errors to result. 
For example, difficult interviewing situations might frequently lead to 
inadequate probing by interviewers, so we might expect a greater fre- 
quency of "don't know" or “no answer” responses in such situations; 
or, in free-answer questions, a smaller frequency of secondary types of
responses. When frank cheating does not occur in difficult situations, we
might expect a high degree of random error. Guest and Nuckols have
shown the degree of non-biasing error which occurs even in a simulated
easy interview situation; we might expect this to be greatly magnified
when the requirements of the task are made more difficult.

It is probably true, however, that if we constructed the interview
situation in such a way that the fulfilment of the task was too simple and
mechanical, we might also find an increase in cheating or random error.
There is considerable evidence in psychological literature to
demonstrate that, up to a point, an increase in task difficulty makes for
increased efficiency and accuracy. As well, some experienced
interviewers have a certain "instinct for workmanship"—a certain sense of
professional artistry—and might feel relegated to a minor clerical role
by extremely simple tasks; consequently, error might result from a de- 
crease in the interviewer’s motivation for the assignment. Also, there is 
some evidence from NORC's survey of interviewers that research direc- 
tors may underestimate the ability of the experienced interviewer to 
carry out difficult assignments.” 

For example, in answer to the question: “How do you feel when 
someone refuses to let you interview them, or meets your approach with 
hostility?" 20 per cent of the inexperienced interviewers in NORC’s 
study reported intense feelings of rejection and 8 per cent saw it as a
personal failure, whereas only 12 per cent of the experienced reported
intense feelings of rejection, and none saw it as a personal failure.
Likewise, while only 6 per cent of the inexperienced responded to such
situations as a “challenge to get the interview,” 18 per cent of the
experienced group perceived the situation in this way.

The contrast between experienced and inexperienced interviewers in 
their willingness to carry out all kinds of assignments is further re- 
vealed in answer to a subsequent question on the NORC study: “How 
much difference does the content of the survey make to you? That is, 
are you just as happy asking about one subject as another, or does your
interest in the work vary a great deal depending on what we are asking
about?” Here, sharp differences between the experienced and
inexperienced groups are revealed. While 54 per cent of the inexperienced
group say their interest depends on the subject of the survey and 38
per cent say it makes no difference, the proportions are almost exactly
the opposite for the experienced group—36 per cent saying it depends
on the subject and fully 60 per cent saying it makes no difference.

One field experiment conducted by NORC and reported by Sheatsley 
illustrates the resistance of professional interviewers to temptations to 
simplify their task. In a test deliberately designed to “trap” the
interviewer into recording the response which would save him from asking
a series of annoying subquestions, no evidence was found in the
aggregate of any distortion of data through such attempts to simplify the
task. The design was as follows: A survey in February contained a
question which suggested that the federal government might not have
enough money to do all the things it would like to do and the
respondent was given a choice of two groups of services on which less money
might be spent—“A” or “B.” The same question was repeated on a
survey the following month, but this time four subquestions were
added, and a split ballot was used. On half the ballots, four tedious
subquestions were asked of those who favored a cut in “A,” and nothing
was asked of the “don’t know” or those who favored a cut in “B.” On
the other half, four subquestions were asked of those who wanted to 
cut down on “B.” The samples were equivalent with each interviewer 
using each form on half of his respondents at random. The hypothesis 
would be confirmed if there were a higher “don’t know” response in 
March, which would be one way to avoid asking the subquestions, and 
if there were a higher response on “A” when the subquestions applied 
to the “B” answer, and a higher response on “B” when the subquestions 
applied to the “A” answer. The results presented in Table 49 below 
provide no evidence whatsoever of such biasing behavior. 


TABLE 49

THE INFLUENCE OF DEPENDENT SUB-QUESTIONS ON DISTORTION OF RESPONSES TO AN
ORIGINAL QUESTION

                                                    PERCENTAGE OF RESULTS WHEN
                                                    SUBQUESTIONS WOULD HAVE TO BE
                                                    ASKED ONLY IF:
                       FEBRUARY       TOTAL RESULTS
RESPONSE               SURVEY         MARCH SURVEY  Cut Down on    Cut Down on
                       (IN PER CENT)  (IN PER CENT) “A” Answer     “B” Answer

“Cut down on A” ....       62             64            66             62
“Cut down on B” ....       25             27            25             28
“Don't know” .......       13              9             9             10
                          100            100           100            100

                       N = 1261       N = 1302      N = 654        N = 648
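
The comparison of the two March ballot forms in Table 49 can be checked with an ordinary chi-square test of homogeneity. The sketch below is illustrative only and is not a calculation reported in the source: the cell counts are rebuilt from the rounded percentages and the sample sizes printed in the table, so they are approximate, and the names `FORMS`, `approximate_counts`, and `chi_square` are hypothetical.

```python
import math

# Percentages and sample sizes as printed in Table 49 (March split ballot).
# Counts are rebuilt from rounded percentages, so they are only approximate.
FORMS = {
    "subquestions follow an A answer": {"n": 654, "pct": [66, 25, 9]},
    "subquestions follow a B answer": {"n": 648, "pct": [62, 28, 10]},
}


def approximate_counts(form):
    """Turn printed percentages back into approximate cell counts."""
    return [form["n"] * pct / 100.0 for pct in form["pct"]]


def chi_square(table):
    """Pearson chi-square statistic for a rectangular table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


table = [approximate_counts(form) for form in FORMS.values()]
statistic = chi_square(table)
df = (len(table) - 1) * (len(table[0]) - 1)  # 2 forms x 3 answers gives df = 2
p_value = math.exp(-statistic / 2)  # exact upper-tail probability when df = 2
print(f"chi-square = {statistic:.2f}, df = {df}, p = {p_value:.2f}")
```

Under these assumptions the statistic stays below the 5.99 required for significance at the .05 level with two degrees of freedom, which agrees with the reading of the table given in the text. The closed-form tail probability used here holds only for two degrees of freedom.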


Effects Arising from Psychological Difficulties of the Task Assigned 


Just as some interviewing situations present the interviewer with 
difficult problems arising from the mechanical procedures prescribed, 
so certain types of situations present psychological difficulties to the 
interviewer that are most easily solved by distortion of data in one way 
or another. 

Demoralization, while it may result from mechanical difficulty, may 
also come about through the prescription of an intrinsically simple task 
which the interviewer finds it difficult to perform psychologically. The 
description by James Stern, cited in Chapter II, of the tension he
experienced in questioning Germans about their reactions to the strategic
bombing is an example of a kind of general demoralization which may
occur because of inability psychologically to accept the task assigned.
Other interviewers have reported similar experiences. One of them,
assigned to obtain a detailed interview on the leisure-time activities of
respondents, reported that it was extremely difficult for him to carry
out this task when interviewing a working-class housewife with five
small children. Clearly this respondent had little leisure time and many 
pressing problems, and the interviewer stated that he felt ridiculous in 
asking how she spent her “leisure hours.” It is likely that some inter- 
viewers will fabricate data rather than continue in this kind of trying 
situation. 

A similar demoralization occurs when the requirements of the survey 
are such as to cause resentment, embarrassment, or even apathy among
respondents. This type of situation is one which Crespi lists as a source
of cheating behavior, and it is evident from hidden recordings of
interviews, obtained during a study by the American Jewish Committee,
described in Chapter II, that where respondents exhibit hostility to the
survey, varying kinds of distortions are introduced by the interviewer.
In these experiments with “planted” hostile respondents, interviewers
failed to repeat questions and occasionally skipped whole batteries of
questions which might have reinforced the respondent’s already
expressed hostility. Other interviewers biased data by readily agreeing
with the respondents’ criticisms of the survey, in an apparent attempt
to ease the tension in the social situation.

In the examples described above we have a conflict between the
demands of the job and the demands inherent in the personal relationship
of the interview situation. When an interviewer's task motivation is
low and his social orientation especially intense, we may expect the
social requirements to take priority in resolving the conflict. However,
because the maintenance of at least a tolerable social relationship is a
prerequisite for conducting any interview, the establishment of rapport
is always a task requirement as well as a social requirement.
Consequently, we frequently find that interviewers will sacrifice an
established procedure if they feel rapport is jeopardized. Thus an
interviewer, some of whose reactions while listening to a phonograph
recording of an interview were reported earlier, remarked in the same
experiment:


I started to get that helpless feeling, he did not answer the question and I
. . . him forcing the answer out of him. You have to force him but as you
force he reacts by feeling more strongly. . . . not be sure what the answer
is . . . so you . . . the . . . then the respondent is up in arms and says
“Didn’t you listen to what I said?” . . . Now that he takes some interest in
the Berlin question but he’s getting sore. If they were on good terms the
interviewer should probe that remark of the respondent, but as it is, no
probe is better.


Since the social relationship can obviously be taxed by inquiries into
certain realms, the content of the questions asked can become an
important situational determinant of this type of bias. Agencies have
always been aware that respondents objected to certain types of
questions and that they may fabricate answers when such occasions arise.
But the focus of concern has been on the respondent as the source of 
the error. However, there is much evidence to demonstrate that, be- 
cause of anticipated objections, questions on certain subjects are asked 
reluctantly by interviewers, and that some interviewers might skip 
such questions entirely. In NORC’s study of interviewers an attempt
was made to explore interviewers’ concerns about asking questions on
particular areas. About half the current staff, in answer to a direct
question, indicated that they remembered questions on past surveys which
they would have preferred not to ask. The table below summarizes the 
types of questions interviewers reported they preferred not to ask. 

TABLE 50

FREQUENCY WITH WHICH INTERVIEWERS SPONTANEOUSLY MENTION DISLIKE OF
PARTICULAR QUESTION TYPES

                                                          Percentage of
                                                          Interviewers* Who
Type of question                                          Express Dislike

Questions relating to financial status; rent, income ....       38
Questions related to sex ................................       25
Questions related to political preference ...............       16
Questions related to religious preference ...............        9
Questions related to age ................................        9
Miscellaneous personal questions: mental health,
  physical welfare, [illegible] .........................   [illegible]
Factual data, personal questions generally ..............   [illegible]
Questions related to inter-racial subjects ..............   [illegible]
Questions too difficult for respondent to understand ....   [illegible]
Miscellaneous: information, trend, card questions,
  questions that meet with disinterest ..................   [illegible]

                                                              N = 76

* The per cents add to more than 100 because some interviewers mentioned
more than one type of question.

The data in Table 50 reveal that so-called “factual” questions are
among the ones most frequently objected to by the interviewers,
particularly when they disclose the respondent's economic status. In
stating the reasons why they preferred not to ask particular types of
questions, interviewers indicated that they thought questions were "too
personal" or embarrassing to the interviewer or respondent. About a
fifth of the interviewers said that respondents became hostile or
suspicious at certain questions and, hence, that rapport was endangered.
Others mentioned that they felt respondents didn't answer personal
questions honestly.

That questions about the respondent’s financial status are among those
most objected to by interviewers is further documented by another set
of questions asked of NORC interviewers.” In an attempt to find out 
what factors lay behind the objections of interviewers to particular 
types of questions, NORC formulated a list of specific questions, some 
previously asked in surveys and others synthetic, and asked interviewers
to imagine that they were to use these on a survey and to indicate which 
ones they would object to asking. Various question types were in- 
cluded, the purpose being to cover a wide range of possible objections. 
While it is not possible to tell exactly why interviewers objected to 
some of these questions, since we did not ask for their reasons, the 
grounds for objection can generally be inferred from the questions. 
Table 51 lists the questions inquired about and the percentage of
interviewers who stated that they would not object to asking them. Included
in the table is our inference as to why the questions might prove
objectionable to interviewers.
Although the absolute percentages in Table 51 are not necessarily
reliable, since interviewers are likely to understate their objections to
their employer in such a hypothetical test, the relative positions of the
questions in terms of the frequency with which they meet objections
is probably dependable. It will be noted that the questions about
finances again draw the most frequent objection, in spite of the fact that
other questions included in the list tap extremely personal areas of
investigation.
To the extent that interviewer effects result from reactions of
demoralization to the content of questions, we should expect as much
error in so-called factual data as in attitudinal data, and in many types
of questions which are routinely used on surveys and regarded as
innocuous. Apparently it is not only those surveys in which we ask about
attitudes which present the interviewer with problems of establishing
and maintaining rapport. Factual items on ordinary surveys
(particularly, it would seem, where financial questions are asked early
in the interview) may threaten rapport, and may cause the interviewer
to introduce error in order to avoid the social difficulties which he
might have to face by following his directions exactly. The effects of
psychologically difficult situations, created by the content of
questions, are probably similar to the effects deriving from mechanical
difficulties: diffuse and random error with a likely increase in "don't
know" and "no answer" responses.

TABLE 51

FREQUENCY OF NORC INTERVIEWERS’ OBJECTIONS TO CERTAIN QUESTIONS

[The figures in the column “Percentage of Staff Who State They Would Not
Object*” are illegible in the source.]

Hypothetical Question                     Presumed Reason for Objection

Who do you think is mainly responsible    Loaded
for high prices in this country—the
big businessman or the small
businessman?

Suppose Russia declared war on            Requires respondent to make guess
Yugoslavia—about how long do you          with little basis for judgment
think the war would last—just your
best guess?

Who do you think is mainly responsible    Loaded
for strikes in this country—the
workers or their leaders?

Can you whistle?                          Innocuous but awkward to the
                                          interviewer because of absurdity
                                          of subject

What religion do you consider             Embarrassing to respondent because
yourself?                                 of personal nature of subject

Do you happen to know the capital         Embarrassing to the respondent
of Syria?                                 because ignorance may be revealed

As you may know the Reciprocal Trade      Awkward to the interviewer because
Act of 1946 provides that countries       of length, complexity, general
in the Western Hemisphere do not          ignorance of respondents on
have to pay a tariff over 12 per cent     technical subjects
on certain types of industrial
commodities provided they allow
American goods the same privileges
at their ports. Do you approve or
disapprove of this policy?

What is your approximate age?             Embarrassing to respondent because
                                          of personal nature of subject

In the last election for President,       Embarrassing to respondent because
did you vote for Dewey, Truman,           of personal nature of subject
Wallace, or Thurmond?

Are there any policies of the             Possibly incriminatory
Communist party which you yourself
admire?

How would you feel about marrying         Embarrassing to respondent because
a Jew?                                    answer may violate social credo

Has anyone in your family ever been       Embarrassing to respondent because
in a mental hospital?                     subject matter is generally taboo

Have you provided for the Salvation       Embarrassing to interviewer because
Army in your will?                        of absurdity for most respondents,
                                          or embarrassing to respondent
                                          because of personal nature of
                                          subject

Do you think masturbation can cause       Embarrassing to both interviewer
mental illness?                           and respondent because subject
                                          matter is generally taboo

What was the total income of your         Embarrassing to respondent because
family last year?                         of personal nature of subject

* The percentages in this table are based on 150 interviewers.

Apart from the general psychological problems of the interview
situation for the interviewer, there are also many specific
psychological problems that present themselves during the course of an
interview. Chief among these, perhaps, are the individual judgments
which he must make in the classification of the responses to pre-coded 
questions. Of course, in many, or perhaps most cases, there exists no
problem, since the majority of answers to poll questions are usually
classifiable in the terms required by the question. However, in the
course of completing his assignment the interviewer meets with many
respondents whose answers are ambiguous, and who therefore present
to the interviewer a psychological problem in making the necessary
judgment in order to classify the answer. It is well known from
experimental studies that judgments of material which is not thoroughly
objective and structured can be influenced by extraneous factors, and
by the context in which the material is placed. Some of the opinions
reported to the interviewer may be affected by the same processes. In
addition, it is known from other experimental studies involving the use
of “absolute scales” that the meaning of categories on a scale is not
rigid, and that the scale may be “anchored” differently for individual
judges depending on a variety of experimental factors. It would seem
likely, therefore, that there would be opportunity for the interviewer's
beliefs, attitudes, and idiosyncrasies to influence the way he defines
the categories and the task, and the way he makes the judgments
entailed in classifying respondents’ answers. Indeed, it might even seem
essential to the interviewer to simplify the difficult task he occasionally
faces by availing himself of various psychological aids to judgment.

Beyond the judgmental problems in classifying answers, there may be
a motivational factor present, which would presumably make bias more
likely to occur when interviewers are required to classify responses. In
addition to the unconscious factors that operate to influence judgment,
whatever conscious motivations there are to bias the results can operate
with greater freedom under such conditions. Should an interviewer
wilfully or carelessly distort the results in the process of
classification, no one in the home office can tell from the mere check mark in a
given answer box that such distortion has occurred. However, under
the procedure of verbatim recording, any bias or dishonesty on the
part of the interviewer might more easily be detected by reference to
the verbatim answers, or by the existence of patterned phrases in his
completed interviews. That interviewers may well realize this was
revealed in the course of the experiment in which interviewers were
asked to record a dummy interview and explain aloud the process by
which they did their recording. As one interviewer remarked when
faced with a difficult answer: "You have to come to a decision
—there's more of a tendency to decide there and less anxiety about
how to code it because the office does not know what the respondent
said. There's no danger; the office can't decide whether I did right 
unless they make correlations and see that the particular answer doesn't 
fit in." 

Moreover, where responses must be classified into answer boxes, 
freedom for the interviewer is even sanctioned to some extent merely 
by the way the situation is defined in his preliminary instructions. For 
this method of recording, he is usually told to check "the answer that 
comes closest to the respondent’s opinion." But under conditions of 
verbatim recording, he is told to record "exactly what the respondent 
said." Since he is given much less leeway under the latter method, we 
would expect bias to be less in evidence. 

For all these reasons, it seemed fruitful to study this particular aspect 
of the interview situation. Under conditions of field classification, one 
might expect to find greater interviewer effects than under conditions 
of verbatim recording.

In an experiment conducted by NORC, the results secured for
equivalent samples under contrasted methods of recording (classification
versus verbatim report) were compared. Since this was an attempt to
test the effect of the classification procedure per se, not the question 
type, questions with stated alternatives were used in both situations, the 
only difference between them consisting of the requirement that the 
answers be classified into pre-coded answer boxes in one case and re- 
corded verbatim in the other. 

It was found that over-all survey results on the three attitudinal
questions tested were not affected by the process of field classification,
but that the distribution of results on the fourth question measuring level
of information was affected by field classification. Requiring
interviewers to classify respondents’ level of information showed a lower
over-all level of awareness than when the verbatim responses were later
coded in the NORC offices. (See Table 52.)

TABLE 52

THE VARIATION IN OVER-ALL RESULTS UNDER TWO METHODS OF RECORDING

                                                Percentage      Percentage
                                                Classified by   Recorded
                                                Interviewer     Verbatim

U.S. spending too much on European Recovery
  Program ...................................       43              39
Spending about right amount ..................       38              38
Spending not enough ..........................        4               5
Don't know ...................................       15              18
                                                    100             100

Heard about North Atlantic Pact ..............       62          [illegible]
Had not heard about it .......................       38          [illegible]
                                                    100

[One further column of figures (77, 12, 11, total 100) survives in the
source without legible row labels or column assignment.]

North Atlantic Pact makes war more likely ....       14              14
  makes peace more likely ....................   [illegible]     [illegible]
  makes no difference ........................   [illegible]     [illegible]
Don't know ...................................   [illegible]     [illegible]

For the total field staff, specific tests of effects deriving from
interviewer expectations or interviewer ideology revealed no differences
under the two procedures. The data from some of these tests are presented
in Tables 53 and 54. Although in general the over-all effects due to
classification were minimal, there was suggestive evidence that the
results obtained by inexperienced members of the staff were more
affected by the classification procedure than those of the experienced.
On two of the four questions, the differences for inexperienced
interviewers were significant at the .01 level, and the aggregated chi-square
for all four questions gives a probability of only .01 that the difference
would have occurred by chance, compared with a probability of .30
for the experienced. These data are presented in Table 55.

TABLE 53

THE EFFECT OF INTERVIEWER’S IDEOLOGY ON RESPONDENT OPINIONS UNDER TWO
METHODS OF RECORDING*

                               CLASSIFIED BY INTERVIEWERS     RECORDED VERBATIM
                               Pro-     Anti-                 Pro-     Anti-
PERCENTAGE                     Inter-   Inter-   Differ-      Inter-   Inter-   Differ-
                               viewers  viewers  ence         viewers  viewers  ence

Approve amount being spent
  on overseas aid ..........     52       54        2           57       44       13
Approve of the North
  Atlantic Pact ............     87       77       10           89       81        8
Believe the North Atlantic
  Pact will make peace
  more likely ..............     74       70        4           70       67        3

* The number of cases on which the percentages were based are as follows:
for pro-interviewers using answer boxes, [illegible]79; anti-interviewers
using answer boxes, 66-68; pro-interviewers using verbatim recording,
330-[illegible]; anti-interviewers using verbatim recording, 62-64.
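
The pro/anti differences in Table 53 can be put through a conventional two-proportion test. The sketch below is an illustration and not the authors' own computation. It uses the verbatim-recording columns only, and the sample sizes are assumptions read off the table's footnote: 330 for pro-interviewers (the legible lower bound of the range) and 63 for anti-interviewers (the middle of 62-64). The function name `two_proportion_z` and the constants are hypothetical.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z statistic and its two-sided probability."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail area of the standard normal distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Assumed sample sizes (see lead-in); percentages are the verbatim columns
# of Table 53, pro-interviewers first, anti-interviewers second.
N_PRO, N_ANTI = 330, 63
VERBATIM_ROWS = {
    "approve amount spent on overseas aid": (0.57, 0.44),
    "approve of the North Atlantic Pact": (0.89, 0.81),
    "believe the Pact makes peace more likely": (0.70, 0.67),
}

results = {
    label: two_proportion_z(pro, N_PRO, anti, N_ANTI)
    for label, (pro, anti) in VERBATIM_ROWS.items()
}
for label, (z, p) in results.items():
    print(f"{label}: z = {z:.2f}, two-sided p = {p:.3f}")
```

With these assumed sample sizes none of the three differences reaches the .05 level, although the 13-point difference on overseas aid comes close; a different choice of sample sizes within the footnote's ranges would shift the figures slightly.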

The latter findings are at variance with an earlier study reported in 
Cantril in which level of experience showed no relation to amount of 
over-all bias. However, one should note that his experiment differed in 
certain essential respects from the one here reported. The earlier study 
dealt with over-all amount of bias rather than bias introduced specifi- 


TABLE 54 
Tue EFFECT or Artitupe-Structure EXPECTATIONS Unper Two МЕтнор5 OF 
RECORDING 
CoxrixcENCY Co-EFFICIENTS 
Between Pais or ANSWERS 
ix Wuicn THE EXPERIMENTAL 
Question Was* 
Classified by | Recorded 
Interviewer Verbatim 
Respondent’s opinion on U.S. participation in world affairs 
and opinion about the North Atlantic Pact. . . . 24 23 
Respondent’s opinion оп the Marshall Plan and opinion on the 
amount to be spent on overseas did cs soe Hes жыка QUOS 59 56 
Respondent’s opinion on the North Atlantic Pact and his belief 
that it makes war or peace likely... -oo enn 9. 73 
т 482 to 


* The number of cases on which the co-efficients were based under pre-coded conditions ranged fro ied 
ranged from 473 to 537. Certain cells were not Ч 


522, whereas the number of cases for the verbatim conditions Д tae 
in this part of the analysis because of difficulty in interpreting what pairs of answers were indicative of expec be 
tion effects. Because these calculations were made on 2 x 2 tables, the co-efficients have been corrected for uH 
influence of broad categories. While the differences in the co-efficients under the two conditions are small, ae 
suggestive evidence in support of our hypothesis is afforded by the fact that the difference between the co~ ae 
cients under the two conditions increases in the hypothesized direction as the pair of attitudes becomes m 
closely associated, despite the fact that the reverse would be expected on grounds of sampling variance. 


TABLE 55 


Tur DirrerentiaL Errects or FiELD CLASSIFICATION Амохс EXPERIENCED 
INEXPERIENCED INTERVIEWERS 


AND 


Tur Рковлвилту THAT THE 
Овтліхер DIFFERENCES ШЫГ 
Ovzn-ALL RESULTS UxpER TW 
Mertuops oF RECORDING " 
Woutp Occur as ^ RESULT О 
SAMPLING FOR INTERVIEWERS 
Wuo ARE 


jenced 
Experienced Inexperience 


Attitude toward amount being spent on European recovery .60 a 
Awareness of North Atlantic Pact... 66 05 “al 
Attitude toward North Atlantic Pact... «ns 46 0 
Belief that North Atlantic Pact makes war likely or peace А 28 
7 а 
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cally in the classification process, and the interviewers defined as “in- 
experienced” had considerably more experience than those in the pres- 
ent study," 
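The contingency co-efficients of Table 54 were computed on 2 x 2 tables and then corrected. As a rough illustration only, the sketch below computes Pearson's contingency coefficient, C = sqrt(chi2 / (chi2 + N)), for a 2 x 2 table and rescales it by the largest value C can reach in a k x k table, sqrt((k - 1) / k). The source does not say which correction was actually applied, so the rescaling shown here is an assumption, and the cell counts are hypothetical.

```python
from math import sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def contingency_coefficient(chi2, n):
    """Pearson's contingency coefficient C."""
    return sqrt(chi2 / (chi2 + n))


def rescaled_coefficient(coef, k=2):
    """Assumed correction: divide C by its maximum for a k x k table."""
    return coef / sqrt((k - 1) / k)


# Hypothetical cell counts, not taken from the study.
cells = (30, 20, 20, 30)
n_cases = sum(cells)
chi2 = chi_square_2x2(*cells)
coef = contingency_coefficient(chi2, n_cases)
coef_rescaled = rescaled_coefficient(coef)
```

Because an uncorrected C cannot exceed about .707 in a 2 x 2 table, some rescaling of this general kind is needed before coefficients are compared with those from finer classifications.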

If we postulate that interviewer effects in pre-coded questions arise 
as a function of the demand on the interviewer that he make “on-the- 
spot” judgments, it would seem to follow that such effects would be 
more frequent where the answers given by respondents are ambiguous. 
It has been previously pointed out that this is true for free-answer
material; it would seem all the more likely to occur in pre-coded
questions, since the alternative of merely writing down the verbatim
responses is not available to the interviewer and he must in all such
cases make a judgment of some sort. It would follow then, that if by some
accident of procedure we increased the frequency of responses which might
prove difficult for the interviewer to classify, we would thereby
increase the likelihood that he introduces error through beliefs, desires
and expectations which are activated as an aid in making the necessary
judgments.

Several studies provide data bearing on this hypothesis. In the study
by Cahalan and associates referred to in the foregoing, questions in
which alternatives are only partially stated or in which an alternative
[not stated] in the question may be recorded seem to be channels for the
[entry] of bias. It seems likely that such questions actually elicit
more ambiguous answers than questions of other types.

A more elaborate test of this hypothesis was provided by an experiment
conducted by NORC. The degree of ideological bias was measured under a
condition which strongly increased the frequency of [...] or ambiguous
answers and then under conditions which reduced such answers. This was
accomplished by changing one question in [half] the questionnaires, so
that a frequently selected middle category was omitted from the stated
alternatives. Since this category was a [...] for unstructured opinions
on the question, its omission would presumably leave the interviewer with
a sizeable number of ambiguous responses which required classification.

The results secured through this experiment were most revealing. It was
found that under the form of the question where there was no ambiguity
in the stated alternatives, differences between ideologically contrasted
interviewers were not significant, whereas under the second form, where
a [large] proportion of answers presented problems of classification,
interviewers tended to classify the ambiguous responses in accordance
with their own ideology. The data are presented in Table 56.


TABLE 56

DISTRIBUTION OF RESPONSES UNDER TWO FORMS OF THE SAME QUESTION FOR
INTERVIEWERS OF CONTRASTED IDEOLOGY

                                  FORM A                   FORM B
                           (ALTERNATIVE OMITTED)   (ALTERNATIVE INCLUDED)
                            AMONG INTERVIEWERS       AMONG INTERVIEWERS
                                 HOLDING                  HOLDING

                            Majority   Minority     Majority   Minority
                            Opinion    Opinion      Opinion    Opinion

                               Percentage of            Percentage of
                           Respondents Answering    Respondents Answering

Less likely (majority) ...     55         40           42         41
More likely (minority) ...     19         30           18         22
No difference ............     18          9           32         27
Don't know ...............      8         21            8         10
                              ---        ---          ---        ---
                              100        100          100        100

                            N = 250    N = 88       N = 249    N = 8[?]


If question form had no relation to bias arising from the interviewer's
own ideology, we would expect differences between the distributions
secured by interviewers of contrasted ideology to be about the same under
both forms. However, if such bias were more operative under one form than
the other, we would find greater differences between contrasted
interviewers under that form. In the foregoing comparison of the two
question forms, the reader can see that differences between the
distributions of the two interviewer groups are in the same direction in
both forms but are considerably greater under Form A than under Form B.
Testing these differences by the Chi-square method, we find that under
Form B the differences are not significant, while under Form A they are
significant at the .01 level. Here, then, is evidence that the form of
the question affects the degree of bias introduced by virtue of the
interviewer's ideology. Under the question form which omitted the “no
difference” alternative, ideologically contrasted interviewers got
significantly different results, whereas under the other form they did
not.
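The Chi-square comparison for Form A can be approximated from the published figures. The sketch below is an illustration, not the authors' computation: it rebuilds approximate cell counts from the rounded percentages and the two sample sizes (250 and 88) given for Form A in Table 56, computes a Pearson chi-square over all four answer categories, and compares it with the standard .01 critical value for three degrees of freedom. Whether the original test used this same grouping of categories is an assumption. Form B is left out because one of its sample sizes is not legible in the table as reproduced here.

```python
def chi_square(table):
    """Pearson chi-square statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Form A percentages from Table 56; columns are (majority, minority) interviewers.
form_a_pct = [(55, 40), (19, 30), (18, 9), (8, 21)]
form_a_n = (250, 88)

# Approximate counts: the percentages are rounded, so these are not exact.
form_a_counts = [[p * n / 100 for p, n in zip(row, form_a_n)] for row in form_a_pct]

chi2_form_a = chi_square(form_a_counts)
CRITICAL_01_DF3 = 11.345  # chi-square critical value, 3 d.f., .01 level
significant_at_01 = chi2_form_a > CRITICAL_01_DF3
```

Under these assumptions the statistic comes out well above the critical value, in line with the significance reported for Form A.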

Detailed data presented in the original report also reveal that
interviewer effects deriving from ideological factors may operate in
different ways for different ideological groups. It was found that
interviewers holding the “majority” political view exerted their bias by
an inflation of the category in which they themselves would have
responded, while those in a “minority” position biased answers by an
inflation of the “don't know” category.

If the results secured here have any generality, they throw a somewhat
new light on past suggestions for controlling interviewer effects. For
example, Cantril implicitly assumed that ideological bias works in the
same way for interviewers of contrasted ideologies when he recommended:


Although interviewer bias exists, by and large the biases in one direction
cancel those in the opposite direction, so that the overall percentage of
opinion is not likely to be significantly wrong. . . . If an investigator wants
to minimize interviewer bias, he should choose an equal number of
interviewers who are biased in different directions.


Were we to follow Cantril’s prescription in the use of question Form 
A in the foregoing, it is obvious that the biases would hardly “cancel 
themselves out.” While the majority category is unduly inflated by the 
majority interviewers, the minority interviewers express their effects
mainly through inflating the “don’t know” and therefore do not inflate
the specific minority category in a balancing fashion. In other words, a 
net shift of the distribution toward the explicit majority position would 
unquestionably take place. 

Although we have no empirical evidence as to why bias works in
such different fashions for the two groups of ideologically opposed
interviewers, certain conjectures can be advanced as possible
explanations of the phenomenon. First of all, the experimental literature
gives ample evidence that the perception of scale values differs for
different individuals, and that such perceptions vary with cultural,
personality, and situational factors. Therefore, it would seem likely
that individuals with such different viewpoints as the majority and
minority interviewer would be likely to perceive the significance of the
scale categories in strikingly different ways.

Thus, even if the opposed groups of interviewers were equally motivated
to bias responses in conformity with their own ideology, it is
conceivable that the majority interviewers might perceive only [their
own] category as agreement with their position. By contrast, the
minority interviewers might perceive all the categories, other than the
majority one, as agreement. Merely in terms of the relativity of
judgment, the minority interviewer who knows that the majority of people
are against him might regard it as a considerable victory to find any
respondent who even goes so far as to question the validity of the
prevailing [view], even if the respondent does not completely espouse
the [opposite] viewpoint. They are not completely against him and might
be won over. The interviewer who is characteristically in a minority
position lives in a hostile world, with the odds stacked against him,
and anyone is welcomed who even indicates mild doubts about the
prevailing position. Thus, in a sense, our minority interviewer might
see the “don’t know” category quite differently from our majority
interviewer. Interpreting it as a vote against the majority, it
might serve him as a satisfactory category for the disposition of doubt- 
ful answers. 

Moreover, if we conjecture about one further element of the situa- 
tion, the finding that the minority interviewer does bias the responses 
by inflating the “don’t know” category becomes even more under- 
standable. In earlier chapters, it was noted that in addition to expecta- 
tions arising from the respondent’s attitude structure as revealed by 
cues in the interview, or from his group membership, interviewers have 
expectations about the attitude any respondent would probably have,
on the basis of estimates of the prevailing sentiment on well-known 
issues. 

We assume therefore, that both the majority and minority inter- 
viewers initially approach any given respondent with the expectation 
that he will probably take the majority view on an issue. What happens 
when the respondent gives an uncertain or "biasable" answer? The 
majority interviewer tends to “press” the uncertain answer into the 
majority category because, in him, expectation and desire coincide. 
The minority interviewer, however, is subject to cross-pressures. On 
the one hand, he expects a majority answer and, on the other hand, his 
ideology motivates him to desire a minority answer. To “press” this 
doubtful answer into the minority category is to depart a considerable 
distance down the scale from his prior expectation. The “don’t know”
category, however, is a lesser distance down the scale from his prior
expectation in the direction of his ideology. Since, as we have already
suggested, the minority interviewer perceives this category as
agreement with his ideology, he can resolve these cross-pressures by
assimilating answers into the “don’t know” category and still satisfy
whatever drive exists to inflate the percentage “on his side.”

If the findings of this one experiment, plus the conjectural
explanation, are substantiated by further research, they will have
[important] implications for the interpretation of survey results. If
this kind of differential manifestation of bias for majority versus
minority interviewers occurs regularly in such situations, poll results
for [certain question] types will be systematically biased toward the
majority [position], especially on issues in which the prevailing
sentiment is clear-cut and well-known to interviewers. Since many
questions now in common use are prone to such ambiguous responses, a
false picture of public sentiments may often be presented.

Further research is needed to substantiate the theory discussed in the
foregoing. For example, experiments parallel to the one here reported,
on issues where interviewers have no preconceptions about the prevail- 
ing viewpoint, would be instructive. If no such differential manifesta- 
tions of bias occurred under these conditions, it would lend support to 
our speculations regarding the influence of expectations in producing 
such effects and would indicate within what domain such errors in 
interpretation are present. 


Effects Arising from Increased Opportunity for Expectational Processes 


In an earlier chapter, we have described expectational processes 
which lead to bias. While these sources of interviewer effect are latent 
in every interviewing situation, it is clear that the degree to which they 
are operative may be in part a function of the situation itself. A brief 
consideration of the situational facilitators of these biasing processes is
given below, with some experimental demonstrations of specific
situational effects.

Role effects.—In some kinds of interviewing situations, it is difficult 
for role expectations to operate. If the respondents are a homogeneous 
group, whose characteristics as individuals cannot be estimated by the 
interviewer on the basis of their appearance or manner, role effects
would be minimal. Conversely, where there is wide disparity between
individuals in the sample, we would expect an increased possibility of
role effects. Likewise, where the individual is interviewed “in context,”
such as his own home, it is possible that the characteristics of the
home might be used by the interviewer as an aid in forming judgments 
about the responses of the individual. 

Questions whose content is “role-linked” will certainly be more
conducive to the operation of role effects. Thus the situational factor of
question content may act to inhibit or heighten role expectations. The
study by Feldman, reported in Chapter III, bears this out. As previously
noted, these tests were made on a series of questions dealing with the
purchase of various items by the respondent, almost always a woman,
and by the spouse, generally the husband. 

In the earlier discussion of these findings, support was adduced for
the view that the significant differences obtained on the questions about
gasoline, automobile repairs, housefurnishings and clothing by the
matched interviewers were due to the relative “proneness” of given
interviewers to expectations about the normal buying roles of husbands and
wives. However, what was not emphasized in the earlier treatment was the
fact that on those items whose purchase is not thought of as the role of
a particular sex, there were no significant differences between pairs of
interviewers for reports about purchases either by the respondent her- 
self or her spouse. It will be recalled from discussion earlier in this 
chapter that interviewer effects may be represented in fairly uniform 
distortions of data among all interviewers, or they may be manifest as 
variations among interviewers resulting from individual differences. 
Apparently there is no significant variation among interviewers on these 
items, because there is no particular problem of “role linkage" for as- 
pects of purchasing-behavior for such items as drugs or hardware, or 
such services as banking, dentistry, and entertainment. 

Apart from question content as a situational determinant of role ef- 
fects, the Feldman findings also provided some evidence that other 
formal features of questionnaire design facilitated role effects. Data 
were presented in Chapter III to show that the presence of a question 
early in the questionnaire “tipped off” the interviewer to certain char- 
acteristics of the respondent and affected his handling of the subsequent 
questions on purchasing-behavior. While such processes are normally 
subsumed in our theory under “attitude-structure expectations,” in this 
instance the prior question altered the belief of the interviewer about 
the roles of the husband and wife. Thus, the evidence has relevance to 
the discussion of role effects, and the influence of questionnaire design 
on such effects. 

Attitude-structure effects.—Like role effects, attitude-structure
effects may be increased by situational factors. An “interlocking”
questionnaire, or one in which the questions are related to the same general
area of opinion, facilitates effects by providing the interviewer many 
cues about the respondent’s attitude structures. Thus this kind of ques- 
tionnaire would be expected to induce greater effects of this nature 
than one in which questions asked have no presumptive attitudinal rela- 
tion to each other. 

One specific situational factor affecting attitude-structure
expectations was studied in the experiment of Smith and Hyman. In this test,
the order in which interviews were collected was the situational variable
tested. It is possible to separate those subjects who heard the
[transcription] which simulated the “ignorant” respondent initially from
those subjects who heard that respondent only after they had been exposed
to the markedly contrasting “intelligent” respondent. That the
application of subjects to orders was fairly random is illustrated by the
fact that the mean age and sex distribution of the two subgroups were
identical.

This situational factor of order of interviewing carries with it the
likelihood that the contrast experienced between successive respondents
enhances the perception of their respective attitude structures. The
incidence of expectational sources of error may therefore not be purely
a function of the proneness of the interviewer, but of the accident of
the sequence of interviewing. That such factors actually operate is
shown in Table 57. In the five instances presented, and in three other
tests, the results uniformly demonstrate that the effect of
attitude-structure expectations is enhanced by the contrast experienced as a
result of the specific situational factor of sequence of interviewing.


TABLE 57

THE ASSIMILATION OF EQUIVOCAL ANSWERS INTO AN “IGNORANT ISOLATIONIST”
ATTITUDE-EXPECTATION STRUCTURE OR “INTELLIGENT INTERNATIONALIST”
STRUCTURE AS RELATED TO THE SITUATIONAL FACTOR OF CONTRAST

                                              SUBJECTS WHO HEARD THE
                                              ISOLATIONIST TRANSCRIPTION

                                                             After
                                              Initially      Internationalist

Proportion of subjects coding the
  Isolationist respondent as taking no
  interest in U.S. policy toward Spain .....  None of the    4 of the
                                              9 subjects     8 subjects
Mean rating on respondent's attitude toward
  international affairs (rating of “5”
  indicates maximum isolationism) ..........     3.8            4.8
Mean rating on respondent's interest in
  international affairs (rating of “3”
  means no interest) .......................     2.56           3.0

                                              SUBJECTS WHO HEARD THE
                                              INTERNATIONALIST TRANSCRIPTION

                                                             After
                                              Initially      Isolationist

Proportion of subjects coding the
  Internationalist respondent as “Approving
  amount U.S. is spending on European
  recovery” ................................  4 of the       8 of the
                                              8 subjects     9 subjects
Mean rating on respondent’s attitude toward
  international affairs (rating of “1”
  means maximum interventionism) ...........     1.63           1.56
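With groups of only eight and nine subjects, one way to gauge the two proportion rows of Table 57 is an exact hypergeometric tail probability (the one-sided form of Fisher's exact test). This check is not part of the original analysis; it is a sketch that takes the counts as printed in the table and assumes the subjects in the two orders can be treated as independent draws.

```python
from math import comb


def hypergeom_pmf(k, group_size, successes, total):
    """P(exactly k 'successes' fall in a group of group_size), margins fixed."""
    return (comb(successes, k) * comb(total - successes, group_size - k)
            / comb(total, group_size))


def lower_tail(k_obs, group_size, successes, total):
    """P(k_obs or fewer 'successes' in the group) under random assignment."""
    lowest = max(0, group_size - (total - successes))
    return sum(hypergeom_pmf(k, group_size, successes, total)
               for k in range(lowest, k_obs + 1))


# Isolationist transcription: "no interest" coded by 0 of 9 subjects who
# heard it initially and by 4 of 8 who heard it after the Internationalist.
p_no_interest = lower_tail(k_obs=0, group_size=9, successes=4, total=17)

# Internationalist transcription: "approving" coded by 4 of 8 subjects who
# heard it initially and by 8 of 9 who heard it after the Isolationist.
p_approving = lower_tail(k_obs=4, group_size=8, successes=12, total=17)
```

Each value is the chance, if order made no difference, of a split at least as lopsided as the one observed in the direction the contrast hypothesis predicts.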


Probability effects.—Particular situations may give rise to some
beliefs as to the probable distribution of opinion among the population.
For example, probability effects could occur after some interviews had
been [completed] by any one interviewer. In such a case he might, in the
light of his initial experience, develop some idea about the probable
distribution of sentiments. Thus the number or sequence of interviews
conducted by a given interviewer on the particular survey might affect
the operation of this source of bias.

Such a theory is difficult to test empirically and we have no
substantial evidence on the problem. However, a suggestive demonstration
of this phenomenon is available as a by-product of a study conducted 
by Curtis Publishing Company. In one study of magazine readership,
the material used for "confusion control” purposes was repeated in 
successive surveys. (In the use of this technique, interviewers are not 
informed that the control material has never appeared in magazine 
form.) Since the samples used in the successive surveys were equivalent, 
one would expect that each sample would contain approximately the 
same per cent of respondents who claim to have read the nonexistent 
magazine material each time. The actual results obtained on the repeated
studies are presented in Table 58.


TABLE 58

CHANGE IN THE PROPORTION OF READERS OF NONEXISTENT MAGAZINE CONTENT IN
SUCCESSIVE SURVEYS

                           PERCENTAGE OF “READERS”

EXHIBITS USED:      1st Time   2d Time   3d Time   4th Time

4 times               12.4      10.6      11.3       9.3

[3 times]             13.1      16.4      10.4
                       9.0       9.9       5.1
                      17.6      16.4      14.7
                       9.3       6.4   [illegible]
                      13.3       8.6      11.0

[2 times]             12.6      11.9
                       9.4       9.0
                       5.5       8.3
                      24.2      20.0
                      18.7       7.5

It may be seen from inspection that in general the average number
claiming readership declines as the control material is used again. In the
eighteen comparisons, we find that in twelve cases there is a decline in
the proportion identifying the material, and in only six cases is there
an increase in this proportion. Moreover, the total net decline is [about]
three times as great as the total net increase.

Since one would expect only slight random variations due to sampling,
the most logical explanation for the results secured in this [study]
is that probability expectations were operating among interviewers. As
they used the material, they came increasingly to expect that
respondents would not indicate readership.
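The count of twelve declines against six increases can be set against what chance alone would produce. The sketch below is not from the original study: it applies a simple one-sided sign test to the counts quoted in the text, under the assumption (certainly only approximate, since the same exhibits recur across surveys) that the eighteen comparisons are independent and that a decline and an increase are equally likely under sampling variation alone.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X distributed Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


declines, increases = 12, 6  # counts reported in the text for Table 58
comparisons = declines + increases

# Chance of twelve or more declines in eighteen comparisons if the
# direction of each change were a fair coin toss.
p_one_sided = binomial_upper_tail(declines, comparisons)
```

On these assumptions the probability is roughly .12, so the direction of the changes by itself is only suggestive; the weight of the argument rests also on the size of the declines relative to the increases.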


CHAPTER VI 


Interviewer Effects Under Normal 
Operating Conditions 


In the previous chapters, we have demonstrated how and why
interviewers may distort survey results under certain specific or relatively
simple conditions, but we have thus far presented little data bearing on 
the magnitude of such distortion in the course of normal survey opera- 
tions. 

Some of the evidence presented in Chapter III, for example, was 
based on laboratory-like studies. The findings of these studies con- 
tribute greatly to our understanding of a given process or component 
of interviewer effect in isolation from the many other factors that 
operate simultaneously with them in actual field situations. But they do
not enable us to infer the extent of distortion under the complicated 
conditions of a field survey, since we cannot analyze fully enough the 
actual situation into its components and their interactions. 

Other evidence presented in Chapters III and IV was derived within
a field situation of a complex nature. However, our generalizations
about the extent of distortion in normal operations are again hindered,
since we concentrated our discussion on a specific determinant of
interviewer effect and abstracted that factor from the total array of
factors. Expectations, group membership, ideology, and the like, all
operate simultaneously. While understanding is increased by the analysis
of these factors separately, it is also important to study their combined
effects and to find out how frequently and to what extent these effects
are a problem in practical field operations. When one considers, further,
the evidence presented in Chapter V that the effects of these distorting
factors vary with a host of minor situational factors and realizes that
[past] studies have been based on a limited range of situations, it is
clear that there is a need for observing these effects over many studies.
For these reasons, we must observe interviewer effect under a wide
[variety of] complex operating conditions in order to evaluate its normal
[magnitude]. In this chapter, we shall present the relatively small body
of data which was specially gathered under conditions appropriate to such
[an evaluation]. We will supplement these limited data by review of the
[literature] in an attempt to improve our estimate of the extent of
interviewer effects.
Before examining the empirical findings, it is well to distinguish
several different classes of measurements of interviewer effects. These
classes cannot be rigorously defined here, but even a cursory considera- 
tion of them enables us roughly to place our empirical work in the 
perspective of the total problem. Three such classes of measurements
will be treated here.


1. GROSS EFFECTS


Strictly speaking, interviewer distortion exists whenever there is any 
deviation from the “true” response (defined in terms of the purposes 
of the study) in the response elicited and recorded by the interviewer 
for a given respondent to a given question. Gross interviewer effect 
over an entire survey may then be defined as a function of the total 
number of such individual deviations (each deviation weighted ideally 
by the degree to which it distorts the conclusions reached by the re- 
search). It is obvious that in order to measure interviewer effect on 
this level it is necessary to have a validity criterion—some conception of 
what the “true” response for a given respondent to a given question is.
Since any such validity criterion for attitude or opinion questions is
rarely, if ever, available and the criterion data for questions of fact or
behavior are seldom obtainable even when such data do exist, the
measurement of gross interviewer effect in this strict sense is seriously
limited even though it would be extremely desirable.

Certain approximations to the measurement of gross interviewer effect
may, however, be more feasible. For instance, one can prescribe a
given set of interviewing techniques as minimizing distortion (e.g., the
interviewer should not use loaded probes, the interviewer should record
exactly what the respondent says). Then, by direct observation or by
some sort of mechanical recording of the total interview, one could
measure the degree to which the interviewing prescriptions are
broken. Ideally, neither the interviewer nor the respondent should be
aware that his behavior is being either directly observed or recorded,
but this condition has to our knowledge only rarely been met. Still,
some sort of compromise where one or both parties are aware of being
observed might still throw some light on the extent of gross interviewer
effect, assuming that our prescriptions of “proper” interviewing
technique are in line with our goals.

Another conceivable way of gaining some insight into the [possible]
extent of gross interviewer distortion is through having each respondent
answer the same questions or discuss the same subject matter through
several different media, for instance, through a self-administered
questionnaire and a personal interview. The discrepancies in the responses
gathered for each respondent for each question or subject matter 
through the different approaches are examined. The central difficulty 
with this approach is that it is almost impossible to determine in any 
specific instance which of the two responses, the oral or the written, is 
the more nearly valid. There is also the possibility that in many in- 
stances, when the two responses differ or even when they are the same, 
both responses are invalid. 

The suggested technique could also be used by having each respond- 
ent interviewed by two or more interviewers using the same interview 
schedule. If one makes some assumption as to the relative skills of the 
interviewers, the superior one can be regarded as a criterion interviewer 
against which gross effects can be evaluated. Such an assumption may
be warranted under conditions where specially trained or highly
professional personnel are used as check interviewers, as in the Census
quality check procedure. This technique has essentially the same short- 
comings as the foregoing, but with even more danger that constant 
distortions, those common to all interviewers, will be obscured. Conse- 
quently, estimates derived from such an approach, at best, set a lower 
limit on the true extent of gross effects. 

Another approximation to the measurement of gross effect involves 
the use of “sleeper questions,” that is, questions for which certain
answers, by definition, are invalid. This would be the case, for example,
of an answer by a respondent that he had read a nonexistent magazine.
Such items are readily constructed and easily applicable to most surveys.
Their use as measures of gross effects has not been sufficiently explored,
although it must be realized that there is some limitation in generalizing
about the magnitude of effects on other characteristics from the
findings on bizarre, nonexistent items.

It should be noted that all these techniques are extremely difficult to
[apply] in the natural field-setting. Even if the co-operation of the
respondent can be obtained, the very attempt to record an interview with a
[machine] or have the same respondent interviewed with the same
schedule several times may in itself make the situation so unlike the
natural field-setting that the findings would tell us relatively little
about the magnitude of gross interviewer effect under [normal]
conditions. The entire problem of the measurement of gross effect thus
[falls under] the Principle of Indeterminacy, and thus far, no one has
thought of an approach that makes the act of measurement itself
in[...] to the field situation we are trying to measure. Only
occasional studies attempting to measure the extent of gross interviewer
effects are reported in this chapter. Only a limited number have been
conducted, and most of those that have been made were done under 
conditions hardly comparable to normal field conditions. At this point, 
we can merely hope that some day the necessary resources to make 
further advances in the study of gross effects will become available. 

It should be noted that the concept of gross interviewer effect de- 
fined in this section, by implication, attributes to the interviewer or the 
interviewing process all invalidity in interview material. For some pur- 
poses it might be desirable to distinguish between irremediable invalid- 
ity; i.e., invalidity which could not be remedied by any change in inter- 
viewing technique or interviewer characteristics—for example, that due 
solely to the respondent—and invalidity which could conceivably be 
removed by the alteration of some controllable element of the inter- 
view situation. A design appropriate to this problem would combine the 
use of criterion data of validity with the assignment of interpenetrating 
samples to classes of interviewers. Then the differential level of validity 
could be examined to determine the influence of the interviewer factor 
on gross effects. For other purposes, it would be well to distinguish be- 
tween invalidity that would remain if the most feasible alternative 
method to the personal interview were used to gather the requisite data 
and the excess invalidity due to the use of the personal interview. A 
design appropriate to this problem would involve comparison of re- 
sults for different enumeration procedures by reference to criterion 
data. Such hypothetical alternative formulations point to the fact that 
the degree to which gross effect need concern us depends on whether 
it can be remedied, whether there are other means of gathering data 
which would enable us to reduce or eliminate it, and whether it affects
the over-all findings of the study.


2. NET EFFECTS

Net effects may be defined as the difference between the distribution
of responses obtained by one or more interviewers to one or more
questions from a given population of respondents and the “true”
distribution of responses to that question or questions for that
population. Here distortions in opposite directions may conceivably cancel
each other so that, even though the responses of particular respondents
have been distorted, there is no net distortion in the marginal
distribution or even in cross-tabulations. This level of measurement is,
of course, very different from gross effect, where all distortions of the
individual responses of individual respondents are always considered as
cumulative and never as canceling out.
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Net effects can be calculated relative to any body of data in the 
survey. They can be determined for the total group of interviewers 
and the total sample of respondents or for a subgroup of interviewers 
and/or a subgroup of respondents, or even for one interviewer and his 
respondents. The errors are simply determined for whatever is the 
group under investigation. Obviously, net effects can occur relative to 
any or all possible groupings of the data. From a practical point of 
view, the particular net effects that should be our central concern are 
those occurring at the specific level of cross-tabulation most crucial 
to the survey. 
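
The distinction between gross and net effects drawn above can be illustrated with a minimal sketch. The function names and the six-respondent data set are hypothetical and are not taken from any study discussed here; the sketch assumes that a "true" answer is available for every respondent, which in practice is the hard part.

```python
from collections import Counter


def gross_effect(true_answers, recorded_answers):
    """Proportion of respondents whose recorded answer differs from the true one."""
    if len(true_answers) != len(recorded_answers):
        raise ValueError("both lists must cover the same respondents")
    wrong = sum(t != r for t, r in zip(true_answers, recorded_answers))
    return wrong / len(true_answers)


def net_effect(true_answers, recorded_answers):
    """Recorded minus true proportion for each answer category."""
    n = len(true_answers)
    true_counts = Counter(true_answers)
    recorded_counts = Counter(recorded_answers)
    categories = sorted(set(true_counts) | set(recorded_counts))
    return {c: (recorded_counts[c] - true_counts[c]) / n for c in categories}


# Hypothetical data: the first and third respondents are distorted
# in opposite directions, so the distortions cancel in the marginals.
true_answers = ["yes", "yes", "no", "no", "yes", "no"]
recorded_answers = ["no", "yes", "yes", "no", "yes", "no"]

gross = gross_effect(true_answers, recorded_answers)  # every distortion counts
net = net_effect(true_answers, recorded_answers)      # opposite distortions cancel
print(gross, net)
```

The same two functions can be applied to any subgroup of interviewers or respondents simply by passing in the answers for that subgroup, which corresponds to computing net effects at a chosen level of cross-tabulation.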

The problems of measurement discussed in connection with gross 
effects also arise here. However, while we would again be plagued by 
the problem of what the “true” responses for our given purpose are, 
in cases where we have defined such “true” responses, it should be 
simpler to obtain the distribution of these responses (e.g., from records 
or other sources) than it would be to obtain the individual true 
responses. That this is so is clearly indicated by the past literature. As 
will be seen below, there are very few direct studies of gross effects, 
whereas there have been innumerable studies of net effects. In a certain 
sense, the many election prediction studies approximate the 
measurement of net effects. Other usual examples involve the comparison of 
survey results with aggregate records (the distribution of true 
responses) of bond purchases, sales of commodities, etc., for a given 
population, which are readily available in the files of government or 
industry. 

While there are many such studies, they are confined mainly to the 
determination of net effects on the marginals for the entire sample of 
respondents interviewed by all interviewers. This is no doubt due to the 
availability of criterion records only in this limited form. In the light of 
our remarks that net effects at some higher level of cross-tabulation 
may be most important, the general unavailability of the refined 
statistical distribution of the criterion data puts serious limitations on 
the practical value of such past literature. It not only limits us in 
qualifying the accuracy of specific findings; it also prevents us from 
drawing inferences as to the origin of net effects. 

An approximation to the measurement of net effect can be made by 
having either the same group of respondents or different random 
samples of respondents from a single universe investigated by personal 
interview and by some other means, and then comparing the 
distributions of responses obtained by the different means. In practice, it 
is often difficult to say definitely which of the distributions (the one 
obtained by interview or the one obtained by other means) approximates 
more closely the “true” distribution, although the investigator 
may often be reasonably certain that one of them is superior.* 

Another approach to net effects involves having either the same re- 
spondents or different random samples of respondents from a single 
universe interviewed by different interviewers using the same schedule, 
and then comparing the distributions obtained by the different inter- 
viewers. This approach is again severely limited by the impossibility of 
determining which of the interviewers is getting the more nearly valid 
responses, and by the possibility that even when several interviewers get 
similar distributions they have all merely distorted responses in the 
same direction. But whenever significant variation among the distribu- 
tions of responses obtained by different interviewers is found, we can 
be sure that at least some of the interviewers are introducing distortion. 
Also, in instances when most of the interviewers get quite similar dis- 
tributions of responses and one or two interviewers get radically differ- 
ent results, it is often assumed that the interviewers getting the more 
common results are getting the more nearly valid results while the 
deviant interviewers are distorting their findings more. There are, also, 
occasional situations where we have certain more or less a priori 
beliefs concerning the way people behave in the interview situation, on 
the basis of which we judge which of the response distributions is more 
nearly valid. For instance, we can assume that certain interviewers, 
perhaps the regular staff supervisors, are highly skilled in eliciting what 
for our purposes are valid responses, particularly if they use a certain 
type of interview schedule and procedure; the responses elicited by them 
can then be used as the criterion distribution against which to compare 
the work of other interviewers using equivalent samples of respondents. 
Or our knowledge (or a priori belief) as to the nature of respondent 
opinions can be used to decide which of several distributions of 
responses is most nearly accurate. Or it can be assumed that an 
interviewer with characteristics similar to those of his respondents will 
obtain reasonably valid responses from these respondents. 

In the following discussion, studies in which different interviewers 
interview samples of respondents from the same universe so that the 
distributions of responses can be compared without any particular 
criterion distribution in mind will be referred to as studies of 
differential net effects. Studies of this sort are extremely common. Although 
they are designed to determine the degree to which interviewers 
distort responses, they generally ignore biases that are constant over the 
entire staff of interviewers. They are justified by two main arguments. 
First, much of public opinion research is devoted to the determination 
of certain functional relations among the data rather than to precise 
description of the data by marginal distributions. A complete determi- 
nation of interviewer effect upon a marginal distribution requires a 
knowledge of the net interviewer effect and, hence, of the true or 
criterion value. But if the effect of each interviewer on the response of 
every respondent is exactly the same (in magnitude and direction), the 
“distance” between the responses of any two individuals would be the 
same as if the responses were completely accurate, and correlations 
(which depend upon the distances between individuals) would be un- 
affected. It is, then, the differences among interviewers in their effect 
on responses that distort measures of relationship. Thus, to determine 
the interviewer effect on a correlation, we need to know only the dif- 
ferential net effect (the difference of an interviewer's results from the 
average for all interviewers) and not the absolute net effect (the dif- 
ference of an interviewer’s results from the true values). It is just the 
biases that are not constant which must be discovered and taken into 
account. 

This argument, though abstract, does at least justify the study of 
differential net effects even in cases in which criterion distributions are 
not available and in which, therefore, the amount of bias in the margin- 
als cannot be ascertained. 
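
The first argument can be checked numerically with a minimal sketch. The data, the size of the biases, and the assignment of respondents to two interviewers are all hypothetical; the sketch only illustrates that a bias constant over all interviewers leaves a correlation untouched, while a bias that differs between interviewers does not.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical "true" scores on two questions for six respondents.
true_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
true_y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
interviewer = [0, 0, 0, 1, 1, 1]  # which interviewer saw each respondent

# Every interviewer shifts answers to x by the same amount.
constant_bias_x = [x + 1.5 for x in true_x]

# Interviewer 0 shifts x upward, interviewer 1 shifts x downward.
differential_bias_x = [
    x + (3.0 if who == 0 else -3.0) for x, who in zip(true_x, interviewer)
]

r_true = pearson(true_x, true_y)
r_constant = pearson(constant_bias_x, true_y)          # same as r_true
r_differential = pearson(differential_bias_x, true_y)  # differs from r_true
print(r_true, r_constant, r_differential)
```

Under the constant shift the marginal distribution of x is wrong by 1.5 units for everyone, yet the correlation is the same as for the true scores; only the differential shift changes it.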

A second reason for special concern over differential net effects is the 
likelihood that the differential effects are those that are most subject 
to remedy. If some interviewers are known to do a better job than 
others, i.e., make either no errors or fewer errors of certain types than 
do other interviewers, then it should be possible to bring the worse 
interviewers up toward the level of the better interviewers, or we could 
at least improve the general level of interviewing through selective 
hiring practices. But errors common to all interviewers somehow appear 
to be less subject to correction because it is not yet clear that it is 
humanly possible to do better. While this generalization about the 
relation between differentiation and mutability might not hold 
universally, it seems well warranted in the light of our body of findings. 
Systematic effects of the expectational sort described in Chapter III seem 
firmly grounded in fundamental cognitive processes. Systematic effects 
deriving from group membership factors described in Chapter IV seem 
firmly grounded due to the current economics of the interviewer labor 
market. Thus to focus on differential net effects is most relevant and 
immediately practical. 

Studies of differential net effects and/or of inter-interviewer 
variation are by far the easiest kind to make under operating field 
conditions. They can often be made at relatively little added expense as a 
by-product of a survey carried out for substantive purposes. In fact, if 
one ignores the important restriction that the samples of respondents 
interviewed by different interviewers be random samples from one 
universe (or that at least the variation between samples due to 
non-interviewer factors be known), studies of this general type can be done 
practically at will any time a survey is made. It is somewhat 
questionable, however, whether this type of study omitting controls over 
respondent factors is a desirable way of examining differential net 
interviewer effect. 


3. INTER-INTERVIEWER VARIATION 


Fundamental to the definition of inter-interviewer variation is a con- 
cept of a universe of interviewers. Then, in order to evaluate inter- 
viewer effects, we compare the distribution of responses actually ob- 
tained with the hypothetical distribution of responses that would be ob- 
tained from a given population if all the interviewers in the universe of 
interviewers were to interview all the respondents. Thus, there is no 
concern here, as there is in the case of gross and net effects, with the 
validity or truth of either individual responses or of a distribution of 
responses. 

Inter-interviewer variation is the variation of the distributions of re- 
sponses obtained by the different interviewers around the hypothetical 
distribution described in the foregoing. This variation is readily 
estimated by a design such as the one described under net effects, where 
different interviewers interview random samples from the same 
population of respondents and the distributions of responses thus 
obtained are compared with each other. 

While the goal of studies of gross and net effect is to reduce the 
degree of invalidity in surveys or at least to determine means of taking 
that invalidity into account in interpreting survey results, the purpose 
of studies of inter-interviewer variation is to enable us to take into 
account an additional component of sampling variance when we set 
confidence intervals around estimates from survey data. This additional 
component of sampling variance is due to the fact that on any particular 
survey we are using only a sample from the universe of interviewers. 
Of course, the simple estimate of inter-interviewer variation is 
generally not the final goal of these studies. Almost all of them attempt to 
determine ways of efficiently diminishing the contribution of 
interviewer variance to over-all sampling variance either through study 
design (e.g., determining the optimum number of interviewers to be used 
for a given sample design) or through interviewer hiring or training 
policy. 
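
One conventional way of estimating such an interviewer component of variance is a one-way analysis of variance on a design in which each interviewer is assigned an equal-sized random sample of respondents. The sketch below is offered only as an illustration of that standard technique; it is not a procedure described in the studies reviewed here, and the function name and the 0/1 response data are hypothetical.

```python
def interviewer_variance_component(groups):
    """One-way ANOVA estimate of the between-interviewer variance component.

    `groups` holds one list of numeric responses per interviewer; all lists
    must have the same length m (an assumption of this simple sketch).
    Returns (mean square between, mean square within, variance component).
    """
    k = len(groups)
    m = len(groups[0])
    if k < 2 or m < 2 or any(len(g) != m for g in groups):
        raise ValueError("need at least 2 interviewers with equal samples of 2+")
    means = [sum(g) / m for g in groups]
    grand_mean = sum(means) / k
    ms_between = m * sum((mu - grand_mean) ** 2 for mu in means) / (k - 1)
    ms_within = sum(
        (y - mu) ** 2 for g, mu in zip(groups, means) for y in g
    ) / (k * (m - 1))
    # Negative estimates are truncated at zero, a common convention.
    component = max(0.0, (ms_between - ms_within) / m)
    return ms_between, ms_within, component


# Hypothetical yes/no (1/0) answers obtained by three interviewers from
# random samples of four respondents each.
answers = [[1, 1, 1, 0], [0, 0, 1, 0], [1, 0, 1, 1]]
ms_between, ms_within, component = interviewer_variance_component(answers)
print(ms_between, ms_within, component)
```

A component near zero suggests that using more or fewer interviewers would change the sampling variance very little; a large component suggests that spreading the sample over more interviewers would be worthwhile.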

One serious difficulty underlying this approach is that the variance 
might sometimes be minimized around a distorted distribution (i.e., a 
hypothetical distribution different from the criterion distribution) if 
the vast majority of the universe of interviewers tended to get invalid 
responses. This qualification may be somewhat academic in the 
instance where there is no clear formulation or measure of what is a valid 
response. It might also overstate the danger, since it is unlikely that 
competent research workers would knowingly concentrate on the 
problem of reducing variance to the exclusion of the problem of bias. 
For instance, if it were found that only about half of an interviewing 
staff could benefit from training so that training tended to increase 
the differentiation in the quality of work between interviewers, it seems 
inconceivable that as a consequence of this anyone would forego 
training entirely in order to keep sampling error at a minimum. Thus, at 
the present, the devotion of resources to the reduction of interviewer 
variance is a reasonable course of action. 

It should be noted here that the published papers on inter-interviewer 
variability that have come to our attention give at least token reference 
to the problem of validity. But the empirical sections of these papers 
usually ignore the problem of validity and devote themselves 
completely to variability. 


4. STUDIES OF GROSS EFFECT 


As was indicated earlier, there has been a paucity of studies of gross 
interviewer effects. On the basis of careful review of these studies the 
only clear conclusion is that gross effects assume no typical value but 
range widely depending on the specific study cited and the 
characteristic evaluated. None of the past studies is directly informative 
on our current need for evidence on the influence of the interviewer on 
the level of validity of the data. Moreover, the character of the field staff 
which obtained the given findings is rarely indicated. Consequently, 
there is not even any inferential basis for relating variations in gross 
effect to given classes of interviewers over the total range of past 
studies. 

The one major study designed to measure gross effect directly and 
to relate these effects to interviewer performance was the Opinion 
Research Center study in Denver in the Spring of 1949. In this study, the 
individual responses to a number of factual questions were validated 
against official records. To questions concerning the possession of a 
library card, driver’s license, and automobile (as well as the year and 
make of the automobile for owners), between 10 and 15 per cent of the 
respondents gave invalid responses. To questions concerning home 
ownership and the possession of a telephone, fewer than 5 per cent 
gave invalid answers. To a question concerning the age of the respond- 
ent, somewhere around 10 per cent of the answers were probably in- 
valid. Far higher estimates were reported for the proportion of re- 
spondents giving invalid replies to a number of questions concerning 
whether or not the respondent voted in a series of elections or con- 
tributed to a community chest, but since the validity of the criterion 
records obtained in these cases is subject to some doubt, full reliance 
cannot be placed on these particular findings. 

These data alone do not permit us to say exactly what portion of total 
invalidity can be ascribed to interviewer effect. But, if it could be 
shown that interviewers varied significantly in the proportion of in- 
valid answers they elicited, then we could be certain that at least part 
of the over-all invalidity is due in a sense to some characteristics or be- 
havior of the interviewer, or at least we could be sure of this for those 
interviewers who got the larger proportions of invalid responses. The 
statistical significance of the variation between interviewers in the 
proportion of respondents giving invalid responses was testable in this 
study since, in each of five sectors of Denver, each of nine interviewers 
was assigned a random sample of the respondents in his sector. Chi- 
squared tests of the significance of the inter-interviewer variation 
within sectors were made and cumulated over the five sectors. These 
tests failed to indicate any significant variation in validity among the 
forty-five interviewers. But three other apparently more powerful 
tests did tend to show that there were actually real differences among 
interviewers in the degree to which they reported invalid responses from 
their respondents. First of all, there were positive intercorrelations 
(the median value of the intercorrelations was +.39) between the 
proportions of invalid responses for a given interviewer for different 
questions. 

Further support from the same study for the existence of differences 
between interviewers may be found from the fact that members of 
certain classes of interviewers tended to get higher proportions of 
invalid responses than did the members of other classes. [Illegible] 
interviewers and interviewers whose performance on a recording test 
indicated a tendency to allow attitude-structure expectations to distort 
their recording of responses tended to obtain relatively 
high proportions of invalid responses. These findings make it appear 
very likely that some of the interviewers were responsible for at least 
some of the invalidity found in the survey. 
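
The chi-squared procedure mentioned above (a test of inter-interviewer variation within each sector, cumulated over sectors) can be sketched as follows. This is a generic Pearson chi-squared computation under stated assumptions, not a reconstruction of the Denver analysis; the counts are hypothetical and the function names are illustrative. With nine interviewers in each of five sectors, as in the Denver design, the cumulated statistic would carry 5 x 8 = 40 degrees of freedom.

```python
def chi_squared_within_sector(invalid_counts, sample_sizes):
    """Pearson chi-squared for a 2 x k table (invalid/valid by interviewer).

    Returns (statistic, degrees of freedom) for one sector.
    """
    total = sum(sample_sizes)
    p_invalid = sum(invalid_counts) / total
    if p_invalid in (0.0, 1.0):
        raise ValueError("no variation in validity within this sector")
    statistic = 0.0
    for invalid, n in zip(invalid_counts, sample_sizes):
        cells = (
            (invalid, n * p_invalid),            # observed, expected invalid
            (n - invalid, n * (1 - p_invalid)),  # observed, expected valid
        )
        statistic += sum((obs - exp) ** 2 / exp for obs, exp in cells)
    return statistic, len(sample_sizes) - 1


def cumulate_over_sectors(sectors):
    """Sum the within-sector statistics and their degrees of freedom."""
    results = [chi_squared_within_sector(inv, n) for inv, n in sectors]
    return sum(s for s, _ in results), sum(df for _, df in results)


# Hypothetical counts: two sectors, three interviewers each, 20 respondents
# per interviewer. Each pair is (invalid answers, sample sizes).
sectors = [
    ([2, 4, 6], [20, 20, 20]),
    ([5, 5, 5], [20, 20, 20]),
]
statistic, degrees_of_freedom = cumulate_over_sectors(sectors)
print(statistic, degrees_of_freedom)
```

The cumulated statistic is then referred to a chi-squared table with the cumulated degrees of freedom; the sketch stops short of computing a probability level.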

We have thus far demonstrated that in the Denver study, a survey 
conducted under more or less normal field conditions, gross interviewer 
effects did occur. But, this particular type of study yields little direct 
information about the process through which this distortion occurred. 
Information of this latter type is best gathered through direct 
observation of interviews. But, as was pointed out earlier in this chapter, it 
would be extremely difficult to record a normal field interview without 
the knowledge of either the interviewer or the respondent. The closest 
approximations we have to this direct observation are two studies 
where wire or tape recordings were made of interviews between 
“planted” respondents and interviewers who were unaware of the 
plant. In each of these studies, interviewers were given normal 
assignments including a number of randomly selected respondents as well 
as one or more respondents with whom it had previously been arranged 
that they answer questions in specified fashion in the interview. The 
interviewers were not aware that they were working on anything but 
a normal assignment, that any of the respondents were in any respect 
“planted,” or that any of the interviews were being mechanically 
recorded. Thus, we here have controlled observations of interviewer 
behavior, since each respondent’s behavior was essentially the same for 
each interviewer that interviewed him. This very stability of behavior 
on the part of the respondents, their failure to react spontaneously to 
the interviewer and be “affected” by him, does make the experiments 
rather unnatural, but they nevertheless yield some notion of the extent 
to which interviewers commit acts that are likely to produce bias in 
interviews. 

The first of these studies was made by Lester Guest. In his study, 
[illegible] college student interviewers with varying degrees of 
interviewing experience all interviewed the same “planted” respondent. The 
respondent attempted to give, in so far as possible, the same responses to 
all the interviewers. The responses to different questions were 
prearranged to vary considerably in the degree of ingenuity in probing 
required on the part of the interviewer in order to elicit a full, codable 
answer from the respondent. 

Criteria for a “good” interview were established, and the wire 
recording and completed schedule for the “planted” interview of each 
interviewer were scored for errors in terms of the criteria. The most 
frequent errors were all basically in the area of inadequate probing and 
recording of free responses. There were fifty-three instances where 
interviewers failed to record “side comments" or left out parts of a free 
response which were needed for the proper interpretation of what the 
respondent said. In sixty-six instances, interviewers failed to probe 
responses that were either vague, evasive, irrelevant, or general. In fact, 
in nineteen instances where the response was evasive, the interviewer 
circled a pre-code as if the question had actually been answered. In 
nineteen instances, also, the interviewers failed to probe for additional 
answers to a question where multiple answers were supposed to be 
elicited, and in twelve instances, “don’t know” responses were not 
probed at all. Another frequent error was of a more or less clerical 
nature; the interviewers had been instructed to distinguish probed from 
unprobed answers, but they failed to do so in forty-one instances. A 
variety of other errors like utter fabrication of responses, changing of 
respondent’s terminology in recording the response, changes in ques- 
tion wordings, and the introduction of the interviewer’s own com- 
ments, ideas, and suggested answers all occurred with generally rela- 
tively smaller frequencies than did the probing and recording failures. 
Of course, it is difficult to evaluate these comparative findings without 
some idea of the number of opportunities available to the interviewer 
for making each type of error and some weighting of the errors in 
terms of the degree of resultant distortion. Nevertheless, the results 
show clearly that interviewers do commit certain errors which 
unquestionably lead to a distorted representation of the opinions or 
knowledge held by particular respondents. 

Additional evidence from a laboratory-like study supports the Guest 
findings that the locus of gross effects is frequently in the area of 
inadequate probing behavior. In this experiment sixty-one interviewers 
on NORC's permanent field staff were sent questionnaires on which 
verbatim answers to open-ended questions had already been recorded. 
They were told that these interviews had been obtained by other 
interviewers in the course of a regular survey, and they were instructed to 
code the verbatim answers into a prepared set of categories. To 
accomplish the task, they were sent general coding instructions and 
specific instructions for each question, similar to the standard coding 
instructions used. They were further instructed that if any particular 
answer did not fit any of the code categories, or if they were completely 
unable to decide on the appropriate code, they should indicate it 
“uncodable” in its present form. In the instance of such “uncodable” 
answers, the interviewer was asked to indicate what additional probe he 
would have used to elicit a reply for the purpose of coding. 

In actuality, the completed questionnaires were entirely fabricated 
and the answers were at different levels of codability, as indicated by 
the variation in the agreement among the interviewers in handling 
different answers. 

The specific aspect of the findings relevant at this point was the ex- 
tent of the tendency to probe when the answer was so vague or con- 
fusing or irrelevant as to require probing. As a criterion for scoring this 
aspect of interviewer performance, four judges, experienced members 
of the NORC professional staff, were independently given the answers 
and asked to perform the same task as that assigned the interviewers. 
Only in the instances where three out of four judges agreed on a 
particular answer was that answer used in scoring the interviewers. By 
reference to this criterion, there was a total of 701 uncodable answers 
among all the answers assigned to the sixty-one interviewers. The 
actual number of instances where the field staff suggested a probe, i.e., 
indicated that the answer was uncodable in its present form and listed 
an additional probe, was 418. Thus, in 40 per cent of the instances 
where expert judges claimed that the interviewers should have probed, 
they did not. This statistic, however, understates the frequency of 
total probing errors, in so far as some of the probes suggested for the 
remaining 60 per cent of the answers were inadequate in content. In 
order to determine the magnitude of error due to poor quality of 
probing, rather than to mere nonoccurrence of probing, the specific probes 
suggested by the interviewers were again rated by judges according to 
fairly well-established and objective criteria. Of the 418 probes 
suggested, 84 were judged to be of poor quality. In other words, error in 
the total realm of probing occurred for the staff as an aggregate in 52 
per cent of the instances. 
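
The two percentages quoted above follow directly from the three counts reported in the text, as the short calculation below confirms. Only the variable names are introduced here; the figures 701, 418, and 84 are those given in the preceding paragraph.

```python
uncodable_answers = 701   # answers the judges agreed required a probe
probes_suggested = 418    # instances where the field staff suggested a probe
poor_quality_probes = 84  # suggested probes later judged inadequate

# Failure to probe at all where a probe was required.
missed_probes = uncodable_answers - probes_suggested
missed_rate = missed_probes / uncodable_answers

# Total probing error: no probe at all, or a probe of poor quality.
total_probing_errors = missed_probes + poor_quality_probes
total_error_rate = total_probing_errors / uncodable_answers

print(missed_probes, round(100 * missed_rate))
print(total_probing_errors, round(100 * total_error_rate))
```

The first figure rounds to the 40 per cent of required probes that were omitted, and the second to the 52 per cent aggregate error in the total realm of probing.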

Of course, any generalization of this statistic is dependent in part on 
the similarity between the level of difficulty of the answers used in this 
experiment and the answers obtained in the usual survey. While no 
rigorous statement can be made on this problem, it can be said that most 
of the answers were at a middle level of difficulty, rather than at the 
extremes of great ease or great difficulty, as indicated by the fact 
that the field staff rarely showed complete unanimity or complete 
disagreement in their replies. In addition, the question of the artificiality 
of the conditions of the experiment limits the generalization. In some 
ways the experiment was easier than the normal field situation, since 
the interviewers had leisure to consider their behavior, and no 
conflicting cues to hinder their judgment. However, they were operating 
in a situation where any of the normal aids to decision of a contextual 
or a spoken nature were eliminated. Despite these limitations, the 
general order of findings certainly supports the Guest finding that error 
may very frequently occur through the process of inadequate probing 
behavior. 

It should be noted that many of the errors made in the Guest study 
need not necessarily have been biasing in any systematic direction or 
particularly motivated by anything but carelessness, lack of 
perseverance due to inadequate job involvement, or simply the inability to 
distinguish a full and unequivocal response from a vague, evasive, ir- 
relevant response, and/or the inability to think of probes that would 
elicit the “proper” type of response. Thus, it would appear highly 
likely that the amount of gross effect would considerably exceed the 
amount of net effect because many of these errors would probably 
cancel each other. 

The Guest study also gives us some information on differential tend- 
encies toward error among the interviewers. There was considerable 
variation between interviewers in the total number of errors, the range 
being from twelve to thirty-six with a mean of nineteen errors. But it is 
impossible, owing to the design of the study, to determine the degree 
to which this variation may be random. It is interesting, however, to 
note that every interviewer made at least three probing errors and at 
least three recording errors. All but one of the interviewers made an 
error in asking the questions on the schedule. As for the type of error 
perhaps most likely to introduce bias into the interview, the 
introduction of the interviewer’s own comments, ideas, or suggested 
answers, one interviewer was guilty of eight of the fifteen occurrences, 
while nine of the interviewers did not commit any such errors. This 
implies that while almost all interviewers do tend to commit errors which 
affect some of the responses recorded for individual respondents, 
relatively blatant biasing behavior is limited to a few aberrant 
interviewers. This conception of the operation of interviewer effect fits 
the principles and findings presented in Chapters II and III and the 
findings of the field studies of inter-interviewer variation discussed in 
detail later in this chapter. 

The other study using recordings of interviews with “planted” 
respondents was made in New York City by the American Jewish 
Committee in co-operation with NORC. In this study, fifteen interviewers 
were hired ostensibly for a special crew job. They were 
heterogeneous with respect to previous interviewing experience and 
various personal characteristics. On the whole, though, they tended to 
be inexperienced at interviewing, two-thirds of them having had no 
previous interviewing experience at all. These were essentially people 
with little or no intrinsic interest in interviewing or in the subject mat- 
ter of the study. They were merely trying to earn a little extra money 
on a part-time basis without necessarily intending to do any interview- 
ing in the future. These recruits were thus more similar to the inter- 
viewers working on the usual crew job than to the permanent inter- 
viewing staff of survey agencies. 

Each interviewer interviewed one to four “planted” respondents, and 
twelve of the interviewers interviewed eight or more uncoached 
respondents whom they selected in assigned households in assigned 
blocks. The general procedure was to have the interviewer first 
interview a “planted” respondent playing the role of a “punctilious liberal,” 
a person incapable of giving an unqualified, categorical response to any 
question. The respondent was instructed to be difficult to interview in 
terms of expressing ambivalent beliefs in all areas but to be friendly to 
the interviewer at the same time. 

Following the interview with the “punctilious liberal,” each inter- 
viewer interviewed several uncoached respondents. Then, he inter- 
viewed a “planted” respondent playing the role of a “hostile bigot.” 
This respondent was instructed to be hostile, unco-operative, and 
suspicious of the entire situation. He generally required considerable 
persuasion to answer many of the questions at all and was on the whole 
quite vicious with the interviewer. 

Following the “hostile bigot” interview, the interviewers interviewed 
further uncoached respondents. Then they interviewed another 
“planted” respondent, who was coached to present different 
interviewing problems to the interviewer, rather than a specific uniform 
role. For example, in several instances, the respondent who was 
assigned to the interviewer was ostensibly not at home, but a roommate 
of the respondent was there and offered to act as a surrogate for the 
assigned person. In others, an aggressive wife was supposed to intrude 
on the interview with her husband, express her own opinions, and in 
general make a nuisance of herself. Several respondents were coached 
to appear more interested in the interviewer and in the interviewing 
than in the substance of the schedule, to make the situation difficult by 
trying to interview the interviewer, albeit in a friendly manner, rather 
than permitting themselves to be interviewed. The multiplicity of 
respondent roles to which the interviewers were exposed, in contrast 
with the standard situation in the Guest study, carries us beyond the 
study of the general process by which gross error occurs. Comparing 
the behavior of the interviewers as they operate in the different 
circumstances presumably illuminates the influence of situational 
pressures. 

The interviewers were totally unaware either of the fact that any of 
their cases were anything but ordinary, uncoached respondents or of 
the fact that any of the interviews were being tape recorded. Of course, 
the uncoached, regular respondent interviews were in all respects 
normal and were not tape recorded. These latter interviews were 
included mainly to establish verisimilitude to a normal survey. 

The tape recordings were transcribed for the analysis. The typewrit- 
ten transcriptions were then compared with the responses recorded by 
the interviewer on the schedule, and the errors found were tabulated. 
Also, the transcriptions were examined for interviewer behavior which 
could be considered as potentially distorting regardless of what was 
recorded on the interview schedule. Errors of this latter type were also 
tabulated. 

Although for the A.J.C. study the classification and tabulation of 
various types of errors was not nearly so refined as that of the Guest 
study, we here, too, are able to learn a great deal about the processes 
through which gross effects occur, as well as their extent. 

The errors made were classified in four broad categories: 

1. Asking errors: omitting question or changing wording of question.

2. Probing errors: failing to probe when necessary, biased probing,
irrelevant probing, inadequate probing, preventing the respondent
from saying all he wishes to say.

3. Recording errors: recording something not said, not recording
something said, incorrectly recording response.

4. Flagrant cheating: not asking question but recording a response,
recording response when respondent does not answer question.

In tabulation, each error was counted equally. On the average, the
interviewer committed thirteen asking errors, thirteen probing errors,
eight recording errors, and four cheating errors on each schedule.
There were fifty questions on the interview schedule, but it was
possible to commit a number of errors on a single question. Still, the
error rate is obviously extremely high. One should only take this
finding, however, as indicative of the kinds of errors that do occur
rather than as representing the extent of error on a normal survey,
since it should be remembered that the "staged" situations were
purposely set up in such a way as to induce the interviewer to make
many errors. Although in the course of a normal survey an interviewer
might well come across a few respondents as difficult as those
encountered here, a considerable proportion of respondents would
normally be far easier to interview than
the “planted” respondents. In easier interviewing situations, the inter- 
viewers would be far less prone to make errors. Also, it should be re- 
membered that the interviewers employed for this experiment were on 
the whole inexperienced and not regular staff members of the agency 
conducting the survey. These latter factors might also partially account 
for the generally poor interviewing performance. 

The errors appeared in general to be highly pervasive. Every
interviewer made at least one error of each of the three non-cheating
varieties. For this experiment, the number of errors committed by
different interviewers did vary tremendously, but this variation could
conceivably have been random. However, the study analysis suggests
that the interviewer differences are real rather than random, and this
seems the reasonable interpretation.

Cheating errors were less pervasive among the staff. Although every
interviewer cheated at least once in the “hostile bigot” interview,
four of the nine interviewers who turned in completed schedules for
this respondent did not really cheat to an appreciable extent. These
four recorded categorical responses to a few questions which the
respondent had failed to answer or had answered in an irrelevant or
equivocal fashion. However, the cheating of these four was of a
completely different order of magnitude from the cheating of another
four interviewers, who completely failed to ask a very large number of
questions (from eighteen to thirty-three questions each), for which
they recorded categorical responses as if the question had been
properly asked and answered. These four interviewers clearly
fabricated a large proportion of the interviews. A ninth interviewer
also fabricated most of the interview, but he did this because he felt
he could not break through the respondent’s hostility. This
interviewer can really be classified neither as cheating nor as not
cheating.

While we cannot test statistically whether the differences in cheating
behavior observed here represent true differences or whether they are
merely random, the differences are very large. This fact and a number
of other findings suggest strongly that there is some basic
intra-individual determinant of cheating behavior. Thus, for example,
it was demonstrated in these data that the similarity of cheating
behavior between split-halves of the interview was much higher than
that of other forms of interviewer error. This demonstration, however,
merely reveals that cheating is not affected much by minor types of
variation occurring within a situation of some particular
character. Those interviewers who blatantly cheated in the “hostile 
bigot” situation also resorted to cheating slightly more frequently in 
the “punctilious liberal” situation than did the other interviewers. But, 
owing to the over-all only slight incidence of cheating in the 
“punctilious liberal” interview, this difference can only be minor. We 
can say that there was slight evidence that there is a tendency for an 
interviewer who cheats in one situation to cheat in others, at least under 
the conditions of this survey. The evidence of apparent bimodality 
(and almost discontinuity) of the distribution of cheating among the 
interviewers is supported by Guest’s finding that flagrant bias or cheat- 
ing is aberrant behavior—an interviewer either cheats a great deal or 
very little in a given situation. 

Yet, even with respect to cheating behavior, the impact of major
situational pressures is clear. Thus, in the “punctilious liberal”
situation, there was on the whole very little cheating. The greater
extent of cheating in the more stressful “bigot” situation was clearly
a function of the need to cheat in order to escape a painful situation
as easily as possible. Even here, only half the interviewers
interpreted the situation as requiring cheating. Consequently,
interviewer cheating is a function of both individual differences and
the nature of the situation.

In the Guest study, cheating was somewhat of a rarity; in the A.J.C.
study, half the interviewers cheated. This was probably due to the
enormous difference in the difficulty of the situations. The Guest
“planted” respondent really didn’t encourage the interviewer to cheat
in order to finish the interview, while the “hostile bigot” situation
obviously did place a premium on cheating. Since few respondents are
as difficult as the “hostile bigot,” the incidence of cheating on the
Guest study probably approximates normal conditions more closely than
the A.J.C. study.

We have thus seen that gross effects occur extensively and are
mediated by certain processes. However, it does not follow that there
will be serious consequences on the results. If the effect of a
particular interviewer on a specific question were not consistent from
respondent to respondent, these gross effects would tend to cancel out
over respondents and there would be relatively little net effect on
marginals. Gross effects might also cancel out over questions on a
single subject matter for a given respondent. The interviewer might
influence one response relating to a given subject matter in one
direction and another response relating to the same subject matter in
the opposite direction.
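The distinction between gross and net effects drawn above can be made concrete with a small sketch. The data below are invented purely for illustration (they are not taken from the A.J.C. or Guest studies), and the function names are hypothetical. The point is only that half of the recorded answers can be wrong while the marginal distribution is left untouched, because errors in opposite directions cancel.

```python
# Made-up answers to one dichotomous question: 1 = "yes", 0 = "no".
true_answers = [1, 1, 0, 0, 1, 0, 1, 0]
recorded_answers = [0, 1, 1, 0, 1, 1, 0, 0]


def gross_error_rate(true, recorded):
    """Share of respondents whose recorded answer differs from the true one."""
    return sum(t != r for t, r in zip(true, recorded)) / len(true)


def net_effect(true, recorded):
    """Shift in the marginal proportion of 'yes' answers caused by the errors."""
    return (sum(recorded) - sum(true)) / len(true)


gross = gross_error_rate(true_answers, recorded_answers)
net = net_effect(true_answers, recorded_answers)
print(f"gross error rate: {gross:.2f}, net effect on marginal: {net:+.2f}")
```

In this illustration two "yes" answers were recorded as "no" and two "no" answers as "yes", so the gross error rate is one half while the net effect on the marginal is zero.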

The magnitude of net effects will be dealt with directly in the next
section. However, certain conclusions can be foreshadowed. There was
some specific evidence from the A.J.C. study that some, although by no
means all, of the effects did cancel within subject-matter areas.
Further, the general evidence already presented, plus additional
evidence below, indicating that much error arises from situational
factors and varies over the range of different situations, suggests
that there would be cancellation across respondents, and perhaps even
within the interview of a single respondent. It is, then, clear that
at least some gross effects would be in a sense random with respect to
their influence on the substantive content of the recorded responses.
However, there was also evidence in the A.J.C. experiment, reported in
Chapter III, that much of the effect appeared to be due to
"attitude-structure expectations." If attitude-structure expectations
were prevalent, one would expect reinforcement of effects in a given
subject-matter area for the same respondent. We are also led to
believe that such expectations would have little net effect on
marginals but relatively great effect on cross-tabulations. This is
only a speculation, however. At present, we cannot determine the
relative incidence of net as compared with gross effects.

While the examination of these tape-recorded interview studies leaves
many questions unanswered, they provide valuable, definitive
descriptions of what occurred in particular interview situations.
Their limitations derive from their small-scale character: their use
of a small number of interviewers of specific types, and of only a few
“planted” respondents covering a limited number of types of
situations.

In the preceding chapter, we gave some attention to the extent to
which interviewer effect was persistent through time, considered in
terms of the distributions of responses obtained by interviewers. Here
we shall again present evidence on the persistence of effects, but use
the individual respondent as the unit of analysis. We shall do this by
comparing the reliability of responses of a given respondent when the
responses are elicited by the same interviewer each time to the
reliability of responses of a respondent when the responses are
elicited by different interviewers. Examination of the repeat
reliability data naturally bears on the problem of whether interviewer
effects will be systematic.

Most of the data to be presented here refer to unchanging factual
characteristics of the respondent, which do not [...]. [...] by
definition, represents error. The findings of this analysis provide
additional evidence on gross effects, while the refined treatment of
the data provides evidence on the systematic occurrence of such
effects.

If a given interviewer has an influence systematic over time on the
responses of a given respondent, then one would expect less variation in
response to a given question when the same interviewer interviews that 
respondent each time than when different interviewers interview that 
respondent. In order for the difference in reliability under the two 
different conditions to be large, the influence of any given interviewer 
on the responses of any given respondent (through interaction or any 
other means) must be highly persistent through time; i.e., if Interviewer
A affects the responses of Respondent I in a particular fashion on one
wave of a panel, he must affect those responses in the same way on the 
other waves of the panel. We are not directly measuring here whether a 
given interviewer affects the responses of different respondents simi- 
larly (our problem in the analysis of net effects). If the variable in the 
interview situation crucial to the determination of response is the inter- 
action between a particular interviewer and a particular respondent and 
the nature of this interaction is not particularly subject to variation over 
time, there will be considerable systematic effects. But, if among the 
crucial variables are highly ephemeral aspects of the interviewing situa- 
tion, like the time of day, the weather, how the interviewer and re- 
spondent happen to be feeling on the particular day of the interview, 
distractions, and other similar factors which might readily be expected 
to differ between two occasions when a given interviewer is interview- 
ing a given respondent, then in general there will be little systematic 
effect over time, even though responses are unreliable. 

Our data here comes from available panel studies, surveys where the 
same sample of individuals is interviewed two or more times. In many 
panel studies, through accident some respondents are interviewed on 
different waves by the same interviewer, while other respondents are 
interviewed by a different interviewer. These two sets of respondents 
constitute the basis of our comparisons. 
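The comparison described above can be sketched in a few lines. This is only an illustrative sketch: the record layout, the interviewer labels, and the sample answers are assumptions made up for the example, not data from any of the panel studies. Each respondent is sorted into a "same" or "different" interviewer group, and the percentage giving identical responses on the two waves is computed within each group.

```python
def percent_identical(records):
    """Percentage of identical wave-1/wave-2 responses, by interviewer condition.

    Each record is (interviewer_wave1, interviewer_wave2,
    response_wave1, response_wave2); this layout is hypothetical.
    """
    tallies = {"same": [0, 0], "different": [0, 0]}  # [identical, total]
    for int_w1, int_w2, resp_w1, resp_w2 in records:
        group = "same" if int_w1 == int_w2 else "different"
        tallies[group][0] += int(resp_w1 == resp_w2)
        tallies[group][1] += 1
    return {g: 100.0 * ident / total
            for g, (ident, total) in tallies.items() if total}


# Invented age-class reports for eight panel respondents.
panel = [
    ("A", "A", "30-39", "30-39"),
    ("A", "A", "40-49", "40-49"),
    ("B", "B", "20-29", "30-39"),
    ("B", "B", "50-59", "50-59"),
    ("A", "B", "30-39", "40-49"),
    ("B", "A", "20-29", "20-29"),
    ("A", "C", "60-69", "50-59"),
    ("C", "B", "40-49", "40-49"),
]
result = percent_identical(panel)
print(result)
```

Because assignment to the two groups happens by accident rather than by design, a difference between the two percentages computed this way is suggestive rather than experimental evidence.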

Comparisons from a number of different panel studies are presented
below. The results are essentially consistent in that, with rather few
exceptions, the responses obtained from respondents interviewed by the
same interviewers on both waves are somewhat more reliable than the
responses elicited by different interviewers on the two waves. But,
these differences in reliability are generally only of moderate
magnitude. It is also true that there is generally a considerable
degree of unreliability to the responses. Since in most instances the
actual shift in the respondent’s characteristics could only have been
negligible, gross interviewer effect must have been rather widespread.
This is especially true in light of the fact that whenever two
interviewers produced the same error in the responses of a given
respondent, or whenever a given interviewer produced the same
erroneous response both times he interviewed a respondent, the
interviewer effect is completely obscured in the analysis. Thus, we
must conclude that some of the more ephemeral situational factors
discussed earlier must be highly influential even as compared to the
more persistent factors in the situation such as the personalities,
relative socio-economic status, or age, etc., of the two participants
in the interview situation, within the limits of the variability of
the characteristics of the interviewing staffs involved.

The earliest study of this type was a study of interviewer ratings
made by Mosteller. In one study, a small national sample of
respondents was interviewed twice with the same interviewers
interviewing the same respondents on both waves. A three-week period
intervened between the two waves of interviewing. In a second national
study, respondents living in cities with more than 100,000 population
were interviewed by different interviewers on two waves of a panel. In
this study, the interviews were spaced about two months apart. Another
panel study using different interviewers was also made in Chicago with
interviews spaced about ten days apart. Even though for the three
studies the universes differed, none of the samples was random, and
the time between interviews differed, the three studies would still
appear to be essentially comparable.
The interviewers on the two national studies rated the respondents on
both waves on a five-point economic scale. When the same interviewers
rated the same respondents, 77 per cent of the ratings were identical.
When different interviewers rated given respondents on the two waves,
only 54 per cent of the ratings were identical.

The Chicago study sample contained almost completely respondents of
average or higher economic status and thus the interviewers used a
truncated rating scale (three categories on one wave, four categories
on the other). Even in this situation, only 55 per cent of the
respondents received identical classifications.
Interviewers estimated the age of the respondent and asked whether he
owned a car on both waves of all three studies. There was greater
reliability when the same interviewer made the rating or asked the
question both times than when different interviewers were involved.

The implications of the greater reliability that existed when the same
interviewer interviewed the same respondent twice are not clear. With
only a three-week period between the first and second interview of the
panel using the same interviewer, it seems very likely that at least
in some instances the interviewer remembered how he had previously
classified the respondent and merely classified him in an identical
fashion the second time. Thus, the greater reliability attained by the
same interviewer classifying the same respondent both times may in
part be an artifact of memory rather than the result of persistent
interviewer characteristics, which result in either stability over
time in the interviewer's perception and frame of reference for
classification of the respondent or in stability of the respondent’s
reaction to the interviewer. Thus, while the latter interpretation has
some validity, it is likely that the difference in reliability
overstates the systematic operation of an interviewer's effect.


TABLE 59

RELIABILITY OF RESPONSES TO REPEAT QUESTIONS IN THREE PANEL STUDIES

PERCENTAGE IDENTICAL CLASSIFICATIONS UNDER DIFFERENT CONDITIONS *

CHARACTERISTIC | National Panel; Same Interviewers on Both Waves | National Panel; Cities over 100,000 Population; Different Interviewers on Two Waves | Chicago Panel; Different Interviewers on Two Waves
Estimate of age of respondent (10-year class intervals) | 90 (277) | 71 (288) | 74 (about 150)
Automobile ownership | 9[...] (256) | 86 (288) | 89 (150)

* Numbers in parentheses are the number of cases upon which the per cents are based.


A somewhat smaller difference in reliability between ratings of the
economic level of respondents made by the same interviewer as against
ratings made by different interviewers was observed in a panel study
conducted in Cincinnati. In this study, where a four-point rating was
used, 78 per cent of the respondents received identical
classifications on both waves when the same interviewer made the
rating, and 68 per cent received identical classifications when
different interviewers were rating. The difference between the
differences (the difference between 23 per cent and 10 per cent) in
Mosteller’s and the Cincinnati study is not statistically significant
but is in accord with our expectations because a six-month interval
separated the first and second wave of the Cincinnati study. This
longer interval would certainly have reduced considerably the
possibility of an interviewer's remembering how he had previously
classified a respondent. Mainly, the persistent factors tended to
produce differences in reliability between the same interviewers and
different interviewers in Cincinnati, while both persistent factors
and memory operated in the Mosteller study. The greater difference in
the Mosteller study was, therefore, to be expected.

A number of other comparisons in the reliability of factual data from 
the Cincinnati study are presented in Table 60 below. It should be 
noted in interpreting these comparisons that the study was not executed 
in accord with an experimental design. These comparisons are merely 
a by-product of a regular panel survey; consequently, extraneous, non- 
random, uncontrolled factors may have affected the results. 


TABLE 60

RELIABILITY OF CINCINNATI FACTUAL DATA

PERCENTAGE OF RESPONDENTS GIVING IDENTICAL RESPONSES ON BOTH WAVES

CHARACTERISTIC | When Interviewed by Same Interviewer * | When Interviewed by Different Interviewers †
Education: [...] | 77 | 67
Education: collapsed into 4-class break | 82 | 79
Frequency of church attendance: [...] | 79 | 67
Frequency of church attendance: collapsed into dichotomy | 92 | 85
Age: [...] | 98 | 98
Service in World War II: [...] | 99 | 98
Newspaper(s) read: [...] | 82 | 82

* The reliability percentages for the “same-interviewer” respondents are based on approximately 90 respondents.
† The percentages for the “different-interviewer” respondents are based on approximately 410 cases.


Another panel study which enabled us to compare the reliability of
demographic information elicited by the same interviewer with the
reliability of the results obtained by different interviewers was
executed in Baltimore jointly by NORC, the Bureau of Applied Social
Research, and the American Jewish Committee. Here, as in Cincinnati,
there was an interval of about six months between the two waves of
interviewing. The same shortcomings in design as existed in the
Cincinnati study apply to Baltimore, since the major purpose of the
study was not experimental.

The results of the Baltimore study are essentially in confirmation of
the results of the two previously discussed studies. The responses of
respondents interviewed on both waves by the same interviewer were
moderately more reliable than were the responses of respondents
interviewed by different interviewers on the two waves.


We have presented in Tables 59-61 three completely independent
demonstrations that some systematic effect of a particular interviewer
on the demographic classification of a particular respondent does
occur. But, by and large, the differences in reliability have not been
particularly large considering the extraneous factors involved in the
study design. These comparisons clearly support the conclusion in
Chapter V to the effect that there is a considerable fluctuating
component to interviewer effect, in addition to a systematic component
of only moderate magnitude. However, the considerable magnitude of
unreliability for unchangeable factual characteristics supports the
evidence presented earlier in this chapter that gross effects are
large.


TABLE 61

RELIABILITY OF BALTIMORE FACTUAL DATA

PERCENTAGE OF RESPONDENTS GIVING IDENTICAL RESPONSES ON BOTH WAVES

CHARACTERISTIC | When Interviewed by Same Interviewer * | When Interviewed by Different Interviewers *
Education: 6-class break | 63 | 34
Education: collapsed into 4-class break | 75 | 67
Income: [...] | 57 | 50
[...] | 75 | 62
[...] | 86 | 79

* The reliability percentages for the “same-interviewer” respondents are based on approximately 80 respondents. The percentages for the “different-interviewer” respondents are based on approximately 470 respondents.


Some evidence on the relative reliability of opinion data collected by
the same and by different interviewers is also available. For opinion
data also, the respondents interviewed by the same interviewers on
both waves were in general more likely to give reliable responses than
were those respondents interviewed by different interviewers. The size
of the difference in reliability varied extremely, but it was not
possible to determine whether this variation was random or connected
somehow with specific question content or form.

In the 1948 Elmira panel voting study, several interviewers
interviewed the same respondents on the second and third waves. The
two waves of interviewing were separated by an interval of about [...]
months. There were two questions which were asked on both waves of
the study. For both of those questions, the respondent was handed a
card with twelve attributes listed on it and was asked which of the at- 
tributes came closest to describing Truman and which came closest to 
describing Dewey. Almost every respondent mentioned several at- 
tributes as descriptive of each of the candidates. 

One of the reliability comparisons is given in Table 62. 


TABLE 62

RELIABILITY OF ELMIRA OPINION DATA

RESPONSES ATTRIBUTING “COURAGEOUS” TO TRUMAN ON SUCCESSIVE INTERVIEWING WAVES

SAME INTERVIEWER ON BOTH WAVES

Second Wave | First Wave: Mentioned “courageous” as describing Truman | First Wave: Did not mention “courageous” as describing Truman | Total
Mentioned “courageous” as describing Truman | 9 | 4 | 13
Did not mention “courageous” as describing Truman | 7 | 32 | 39
Total | 16 | 36 | 52

DIFFERENT INTERVIEWERS ON THE TWO WAVES

Second Wave | First Wave: Mentioned “courageous” as describing Truman | First Wave: Did not mention “courageous” as describing Truman | Total
Mentioned “courageous” as describing Truman | 78 | 75 | 153
Did not mention “courageous” as describing Truman | 64 | 468 | 532
Total | 142 | 543 | 685


The method we have used in computing reliability is to take the ratio
of the number of respondents mentioning the attribute “courageous” on
both waves to the total number of respondents mentioning the attribute
on either wave; in other words, the denominator of this ratio is
composed of those who mentioned the attribute on both waves plus those
who mentioned it on the first wave and not the second, plus those who
mentioned it on the second wave but not the first. For the attribute
“courageous,” the reliability for the “same-interviewer” respondents
would thus be 9/20 or 45 per cent and for the “different-interviewer”
respondents it would be 78/217 or 36 per cent.

The reliability percentages for the questions about Truman and
Dewey, computed in the manner described, are presented in Table 63.


TABLE 63

A COMPARISON OF THE RELIABILITY OF RESPONSES OBTAINED WHEN THE SAME
INTERVIEWER INTERVIEWED GIVEN RESPONDENTS ON BOTH WAVES AND WHEN
DIFFERENT INTERVIEWERS INTERVIEWED GIVEN RESPONDENTS ON THE TWO WAVES *

ATTRIBUTE | Truman card question: Same Interviewer | Truman card question: Different Interviewers | Dewey card question: Same Interviewer | Dewey card question: Different Interviewers
Courageous | 45 (20) † | 36 (217) † | 61 (28) † | 5[...] (394) †
Conservative | 100 (6) | 28 (130) | 50 (14) | [...] ([...])
[...] | 37 (19) | 24 (274) | 0 (4) | 19 ([...])
Honest | 73 ([...]) | 54 (446) | 50 (26) | 32 (478)
Inadequate | 60 (20) | 43 (280) | 29 (7) | [...] (54)
[...] | 0 (4) | 20 ([...]) | 48 (23) | 34 (317)
[...] | [...] ([...]) | [...] (422) | [...] ([...]) | [...] ([...])
[...] | 60 ([...]) | 15 ([...]) | 56 ([...]) | 5[...] (507)
[...] | [...] ([...]) | [...] ([...]) | [...] ([...]) | 23 (78)
Well-meaning | [...] ([...]) | [...] ([...]) | [...] ([...]) | [...] (344)
Thrifty | [...] ([...]) | [...] ([...]) | [...] ([...]) | 27 ([...])
Opportunist | 20 (5) | 14 (69) | 70 (10) | [...] ([...])

* The percentages given in the table are measures of reliability calculated in the manner described in the text.
† The numbers in parentheses indicate the number of respondents involved for each reliability percentage, based on the respondents mentioning the attribute on either or both waves.
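The reliability ratio defined for Table 62 and reported throughout Table 63 can be checked with a short sketch. The function name and argument names are illustrative choices, not the authors'. The counts are the “courageous” counts for Truman, taken so that they agree with the denominators 20 and 217 quoted in the text (for the different-interviewer group, 78 + 64 + 75 = 217). In present-day terms the measure is the overlap of the two sets of mentioners divided by their union.

```python
def mention_reliability(both, first_only, second_only):
    """Respondents mentioning the attribute on both waves, divided by
    all respondents mentioning it on either wave."""
    denominator = both + first_only + second_only
    if denominator == 0:
        raise ValueError("no respondent mentioned the attribute on either wave")
    return both / denominator


# "Courageous" as describing Truman.
same_interviewer = mention_reliability(both=9, first_only=7, second_only=4)
different_interviewers = mention_reliability(both=78, first_only=64, second_only=75)

print(f"same interviewer: {100 * same_interviewer:.0f} per cent")
print(f"different interviewers: {100 * different_interviewers:.0f} per cent")
```

Respondents who mentioned the attribute on neither wave (32 and 468 in Table 62) are deliberately left out of the denominator, so the measure is not inflated by the large group for whom the attribute never came up.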


It is clear that there was a definite tendency for respondents
interviewed by the same interviewers on both waves to give more
reliable responses than those respondents interviewed by different
interviewers. There were a few exceptions to this tendency, but almost
all the large differences were in the direction of greater stability
of the responses of “same-interviewer” respondents. But, the
exceptions and the incidence of a number of small differences favoring
the “same-interviewer” respondents indicate that the systematic
effects that must exist are only of moderate importance.



| Interviewer Effects Under Normal Operating Conditions 251 
A number of opinion questions from the first wave of the Cincinnati 
panel, discussed earlier, were repeated on the second wave of that 
study. The relative reliabilities for a sample of those questions are pre- 
sented here. Again, in this study, there are definite indications that the
“same-interviewer” respondents tended in general to be more stable in
their responses than the “different-interviewers” respondents.


TABLE 64

RELIABILITY OF OPINION DATA IN THE CINCINNATI STUDY

PERCENTAGE GIVING IDENTICAL RESPONSES ON BOTH WAVES

QUESTION | Of Those Respondents Interviewed by the Same Interviewer on Both Waves * | Of Those Respondents Interviewed by Different Interviewers on the Two Waves *
1. Do you think there will always be wars between [...], or do you think someday we'll find a way to [...]? | 78 | 66
2. Do you think it will be best for the future of this country if we take an active part in world affairs, or if we stay out of world affairs? | 7[...] | 70
3. In general, are you satisfied or dissatisfied with the progress that the United Nations organization has made so far? | 69 | 62
4. Do you think we can count on Russia to meet us halfway in working out problems together? | 76 | 72
5. Have you heard or read anything about the veto power in the United Nations? | [...] | [...]
6. Do you expect the United States to fight in another war [...]? | 56 | 58

* The percentages for “same-interviewer” respondents are based on about 90 cases for Questions 1, 2, 4, and 6, and on about 55 cases for Questions 3 and 5. The percentages for “different-interviewer” respondents are based on about [...] cases for Questions 1, 2, 4, and 6, and on about 260 cases for Questions 3 and 5. The percentages for Questions 3 and 5 are based on fewer respondents because these questions were asked only of people having heard of the UN.



Another type of test of the extent of systematic interviewer effect
over time can be made with the Cincinnati panel data. Two rough
indices, one of interest in international affairs and the other of
information concerning the UN, were set up for each wave of the panel.
The magnitude of the change between the first and second wave for each
of the indices was computed. In comparing the changes of the two sets
of respondents in this way, we are making a compound test, examining
simultaneously whether effects were systematic over different
questions and whether they were systematic over time.

The mean absolute value of the change in score is compared in Table
65 for the two sets of respondents for both indices. It is clear that
neither of the differences in the mean magnitude of change in score is
even near to being statistically significant. In fact, for the
information index, the mean absolute change in score for those
respondents interviewed by the same interviewer was actually greater
than the mean absolute change for the respondents interviewed by
different interviewers. This difference is the opposite of what we
would expect if there had been systematic effects. The results
certainly provide no basis for assuming that there are effects that
are systematic over both questions and time.


TABLE 65

RELIABILITY OF OPINION INDICES IN THE CINCINNATI STUDY

MEAN ABSOLUTE VALUE OF CHANGE IN SCORE FROM THE FIRST TO THE SECOND WAVE

INDEX | Respondents Interviewed by Same Interviewer on Both Waves | Respondents Interviewed by Different Interviewers on the Two Waves | Difference Between Means | Critical Ratio (Ratio of Difference Between Means to Standard Error of Difference)
Index of interest in international affairs | .90 | .96 | .06 | [...]
Index of information about the United Nations | 1.42 | 1.34 | -.08 | -.5
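The quantities reported in Table 65 (mean absolute change in score for each group, the difference between the means, and the critical ratio) can be sketched as follows. This is a hedged illustration only: the score changes are invented, and the standard error used here is the ordinary unpooled formula for a difference between two independent means, which is an assumption, since the text does not state how the standard error was computed. The difference is taken as different-interviewer mean minus same-interviewer mean, which matches the signs shown in the table.

```python
import math
from statistics import mean, variance


def critical_ratio(changes_same, changes_different):
    """Return (difference between mean absolute changes, critical ratio).

    Standard error: sqrt(s1^2/n1 + s2^2/n2), an assumed formula.
    """
    abs_same = [abs(c) for c in changes_same]
    abs_diff = [abs(c) for c in changes_different]
    difference = mean(abs_diff) - mean(abs_same)
    std_error = math.sqrt(
        variance(abs_same) / len(abs_same) + variance(abs_diff) / len(abs_diff)
    )
    return difference, difference / std_error


# Invented wave-to-wave changes in an index score (second wave minus first).
same_group = [1, -1, 0, 2]
different_group = [2, -1, -2, 3]

difference, ratio = critical_ratio(same_group, different_group)
print(f"difference between means: {difference:.2f}, critical ratio: {ratio:.2f}")
```

A critical ratio well below about 2 in absolute value, as in both rows of the table, is the kind of result the text describes as not even near to statistical significance.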


We have thus seen that a multiplicity of comparisons from a number of
different panel studies support in general the fact that there is some
interviewer effect on the response which is systematic over time. The
several anomalous comparisons and the generally small differences, as
well as consideration of such spurious factors as the recollection on
the part of the interviewer or respondent of the response on the
preceding wave and the non-randomness involved in the design, make it
clear that in general the systematic effects over time are at most
moderate in magnitude. This conclusion on the basis of these panel
comparisons is in line with the discussion of systematic interviewer
effects in the preceding chapter.

3. DIFFERENTIAL NET EFFECTS AND INTER-INTERVIEWER VARIATION 


Differential net effects and inter-interviewer variation will be
discussed together because of the similarity of the study designs used
in the two areas.

A vast majority of the published studies of differential net effects
and inter-interviewer variation in the course of normal field
operations show a widespread occurrence of these phenomena with rather
considerable magnitude in various situations and with the use of
various question-forms on various subject matters. According to the
general view of these studies, significant inter-interviewer variation
is the rule rather than an exceptional event.

In the course of our work, we have made two studies the designs of 
which were particularly appropriate for the examination of the inci- 
dence of significant inter-interviewer variation. In both studies, several 
interviewers were assigned random samples of predesignated 
respondents from the same universe, so that any variation in responses in 
excess of random variation would be ascribed to some sort of 
interviewer bias. 

The first of the differential interviewer effect studies was made in 
Cleveland in 1948. This analysis was done in conjunction with an 
NORC Survey of the residents of three Cleveland suburbs on the 
adequacy of their transportation facilities. A systematic random sample 
of households within the specified suburbs was drawn from the Cleveland 
Householders' Directory. The sample households falling into each 
census tract were divided into blocks of about fifty households each, on 
the basis of propinquity. Each of two interviewers was assigned 
systematic random halves (alternate sample households) of the sample 
households within each block. There were ten such blocks of paired 
interviewers in the study. 
The existence of differential net effects among different interviewers 
was tested by comparing the amount of difference in the distributions 
of responses recorded by two interviewers in one block with the amount 
of difference in the distributions which might occur with a 
reasonable probability between two samples from a single universe. 
Statistically significant differences were taken to indicate the operation 
of a differential net effect. Most questions were treated as dichotomies. 
Chi-squared was used as a test of significance of the difference 
between the proportion of the respondents of one interviewer 
giving a specified response and the proportion of the respondents of 
the other interviewer in the same block giving that response. Then, in 
order to test for the existence of differential effect on any single 
question, the Chi-squareds from the ten pairs of interviewers were 
cumulated. 

The questions on the survey were mainly of the fixed-response type 
and dealt with a variety of subject matters in the general area of 
shopping and travel habits. The question form also varied, there being a 
mixture of both fixed-response and free-answer questions. 

Some forty-five questions were examined for differential net effects. 
Of these, only five questions showed significant intra-paired 
inter-interviewer variation at the .05 significance level. The variation between 
interviewers on four of these five questions was very large and had 
accordingly very small probability of occurring by chance from a 
universe with no inter-interviewer variation. These four questions were 
all subquestions of the two questions on the last place of purchase of 
several items. The questions and results were: 


Question: “The last time you shopped for (item) did you get them 
downtown or in neighborhood stores?” 

                      Chi-squared   Degrees of Freedom   P-value 

Gasoline                 30.75             10             .001 
Auto repairs             43.21             10             .0001 

“Now I'd like to know about the main earner (main shopper) of the 
household. The last time he (she) wanted any of the following things, did 
he (she) get them downtown or in some neighborhood area?” 

                      Chi-squared   Degrees of Freedom   P-value 

Clothing                 24.01             10             .01 
Housefurnishings         38.04             10             .0001 
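
The cumulated test described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' own computation: it assumes each question is reduced to a 2 x 2 table per block (two paired interviewers by response given or not given), it uses the uncorrected Pearson Chi-squared (the text does not say whether a continuity correction was applied), and it adds the ten one-degree-of-freedom statistics to obtain a statistic with ten degrees of freedom. The function names and the block counts are hypothetical; only the four cumulated values are taken from the table.

```python
import math


def chi2_2x2(a, b, c, d):
    """Uncorrected Pearson Chi-squared for the 2 x 2 table [[a, b], [c, d]].

    Rows are the two paired interviewers; columns are respondents who gave
    the specified response and respondents who did not.
    """
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        return 0.0  # a degenerate table carries no information
    return n * (a * d - b * c) ** 2 / margins


def chi2_sf_even(x, df):
    """Upper-tail probability of Chi-squared, closed form for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term = total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def cumulate(tables):
    """Sum the per-block statistics; df equals the number of blocks."""
    return sum(chi2_2x2(*t) for t in tables), len(tables)


# Hypothetical counts for ten blocks: (yes_A, no_A, yes_B, no_B).
blocks = [(10, 15, 15, 10)] * 10
stat, df = cumulate(blocks)
print(stat, df, chi2_sf_even(stat, df))

# Tail probabilities for the cumulated values reported in the table.
for label, value in [("Gasoline", 30.75), ("Auto repairs", 43.21),
                     ("Clothing", 24.01), ("Housefurnishings", 38.04)]:
    print(label, chi2_sf_even(value, 10))
```

Summing independent one-degree-of-freedom statistics is what gives the ten degrees of freedom shown for every item in the table; the closed-form tail function is used only to keep the sketch free of third-party libraries.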


A full exploration of the possible sources of bias on these particular 
questions appeared in Chapter III, Section 2, and in Chapter V of this 
monograph. But this does not concern us here. The important 
consideration here is the fact that on about forty out of forty-five opinion 
and factual questions on this survey there was no particular evidence 
of differential net effects. 

An additional fact about the research design should be considered 
before evaluating the import of this study. The variation that was 
examined was in all cases the variation between the results of paired 
interviewers. Hence, in cases where both the interviewers in a given block 
biased their results in one direction and both the interviewers in 
another block biased their results in the opposite direction, we would find 
no indication of differential net effect from our test even though such an 
effect was in operation on the question. Since the interviewers were 
paired within blocks on an essentially random basis, there would be no 
particular tendency beyond chance for paired interviewers to be more 
alike in their biasing tendencies than non-paired interviewers. Still, 
some differential net effects may have been overlooked owing to 
chance pairings of similarly biasing interviewers. 

Then there is the possibility that our significance tests were too weak 
to pick up differential interviewer effect. It is true that only extremely 
large differences in the universes would result in significant differences 
between two samples of only twenty-five cases each a reasonable 
proportion of the time. Nevertheless, since only one of the forty-five 
questions had inter-interviewer variation that would have occurred with a 
probability of between .05 and .01 with no true differential net effects, 
we can rather safely conclude that on a large proportion of the 
questions on the survey there was relatively little or no differential net 
effect. 

These conclusions about the general absence of serious differential 
net effects were also confirmed by our second large field study designed 
to examine this problem. This study was part of the 1949 validity study 
in Denver discussed earlier in this chapter. The study was designed so 
that each of nine interviewers had geographically equivalent 
interviewing assignments of predesignated respondents in a single sector of the 
city. Within a sector, there was no clustering whatsoever of respondents 
by interviewer. This design was replicated in all five sectors of the 
city. The complete design is discussed very fully in the article treating 
the study. 
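
The remark made above about the Cleveland tests, that only extremely large differences in the universes would be detected with two samples of twenty-five cases each, can be illustrated by computing the power of a single paired comparison. The following is a minimal sketch, assuming an uncorrected Chi-squared test on a 2 x 2 table at the .05 level and exact binomial sampling; the function names and the pairs of true proportions are hypothetical and are not taken from the study.

```python
import math

CRITICAL_1DF_05 = 3.841  # Chi-squared critical value for 1 df at the .05 level


def chi2_2x2(a, b, c, d):
    """Uncorrected Pearson Chi-squared for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / margins


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n binomial trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def power(p1, p2, n=25):
    """Probability that two samples of size n differ significantly.

    p1 and p2 are the true proportions giving the response in the two
    universes; every possible pair of sample counts is enumerated.
    """
    total = 0.0
    for x1 in range(n + 1):
        w1 = binom_pmf(x1, n, p1)
        for x2 in range(n + 1):
            if chi2_2x2(x1, n - x1, x2, n - x2) > CRITICAL_1DF_05:
                total += w1 * binom_pmf(x2, n, p2)
    return total


# Hypothetical true proportions for the two interviewers' universes.
for p1, p2 in [(0.5, 0.5), (0.4, 0.6), (0.3, 0.7), (0.2, 0.8)]:
    print(p1, p2, round(power(p1, p2), 3))
```

The sketch covers one block only; cumulating the statistic over ten blocks, as was done in the study, raises the power above that of any single paired comparison.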
A Chi-squared test of significance of the variation between the 
results of the different interviewers was made for each sector. Then, for 
each question the Chi-squared tests were cumulated over the five 
sectors. 

The interview schedule used was composed of a variety of different 
types of question. The schedule included fixed-response questions 
involving the use of a card, three- and five-point scales, dichotomies, and 
questions where one of the pre-coded responses was not included in the 
list of alternatives stated in the question. There were also several 
free-response questions and a number of interviewer ratings of 
characteristics of the respondent and his dwelling. 

The subject matter of the schedule was also quite varied, dealing with 
the respondent's attitudes to his neighborhood, interest and opinions on 
local and national issues, voting behavior, and factual characteristics. 
The outstanding finding was that significant (at the .05 level) 
inter-interviewer variation appeared on only eight of twenty-one 
fixed-response opinion questions. However, six of the questions with significant 
variation were sub-parts of a single omnibus question with ten subparts, 
and the other two which showed significant variation were nearly 
identical in form. Also, significant inter-interviewer variation was found on 
only one of the seven traditional “factual” questions asked. The 
questions with significant inter-interviewer variation and the results of the 
significance tests were: 


Fixed-response opinion questions 

                                   Chi-squared   Degrees of Freedom   Probability 

We are finding out how much interest people take in various problems. 
(Respondent was handed a card listing three degrees of interest: “A 
great deal,” “some,” and “practically none.”) For example, which of 
those degrees of interest would you say you take in ______? 

  U.S. policy toward Spain           211.79            120             .0000001 
  City planning in Denver            237.27             96             .003 
  Unemployment in the U.S.           147.24            112             .013 
  Denver Negro situation             148.15            120             .04 
  Denver Public Schools            [illegible]          88             .04 
  Presidential election              120.31             96             .05 

[...] difference, or not much difference? 

                                     163.33            112             .0008 

Now if something prevented you from voting in a Presidential election, 
how much difference would it make to you personally: would it make a 
great deal of difference, quite a bit of difference, or not much 
difference? 

                                     136.92            104             .015 

Factual questions 

Do you happen to own an automobile at the present time? (If “Yes”) Is it 
registered in your name alone, or in your (wife's) (husband's) name also? 

                                     184.05            152             .04 


The similarity of the form of the question where most of the 
differential net effects appeared on the Denver study, the omnibus interest 
question, to the form of the question where most of the differential net 
effects appeared on the Cleveland study, the omnibus shopping 
question, should be noted. In each case, we have a single question repeated 
over and over again, only with slight variation in the object in the 
question. As one would expect on a priori grounds, on both surveys a 
few interviewers complained about the dullness of these particular 
questions to the respondents. Not only were the questions deemed to 
be initially lusterless, but it was felt also that the respondents found the 
repetition boresome. Thus, it can be hypothesized that, being eager to 
go through this part of the questionnaire in a hurry, the interviewers 
may have become quite slipshod in both the asking of these dull and 
repetitious questions, and in the recording of answers to them. 

It is interesting to note that while these seemingly innocuous 
questions concerning the respondent's interests showed significant 
inter-interviewer variations, there were several questions concerning what 
would appear to be rather affect-laden opinion areas (e.g., political 
affiliation, satisfaction with the community) which did not have any 
such significant variation. It is hard to imagine many interviewers being 
even unconsciously motivated to distort responses to most of the 
interest subquestions by anything but a desire to get an unpleasant task 
over with as soon as possible, but one can imagine interviewers getting 
some gratification out of having respondents give some particular 
response to more important opinion questions. We may conjecture that 
the obviously greater inter-interviewer variation found on some of the 
interest subquestions than in the more strictly opinion questions may be 
due to factors which we may consider as situational, and this 
contributes additional evidence in support of the theory presented in 
Chapter V. 
Another factor which may have contributed to the high incidence of 
inter-interviewer variation on the interest questions was an apparent 
confusion on the part of respondents, and possibly on the part of 
interviewers, as to the meaning of the questions. From reports filed by 
interviewers after the completion of their assignments, there was 
considerable evidence that many respondents tended to respond in terms 
of their attitudes in the various subject-matter areas or in terms of the 
degree of interest they felt they should take, rather than the interest 
they actually did take. Also, a really operational definition of 
interest was absent, and it is clear that the word had little meaning for 
some respondents and variant meanings among those who did 
understand it. Thus, a great deal was left to the discretion of the interviewer. 
The factual questions were relatively straightforward in contrast to 
these interest questions. There would seem to have been little 
danger of a respondent failing to comprehend their meaning, and so the 
interviewer's discretion impinged less upon the response. Where, on a 
given question, the interviewer must adhere strictly to [...], i.e., where 
he has [...], we would expect a lesser degree of inter-interviewer 
variation to be found on that question. 

Although the incidence of substantial inter-interviewer variation was 
generally absent for the fixed-response opinion questions and the 
factual questions, there were highly substantial and statistically 
significant differences between interviewers in their [...]. [...] 
significant [...] variation between interviewers in the [...] per 
respondent they obtained to free-response questions. These latter 
findings do not at all contradict the Cleveland findings, though, 
because the form of the questions from that interview was similar to 
that of the fixed-response and factual questions of the Denver study. 
Thus, in the area where our two studies overlap, the findings are in 
essential agreement: that there was little evidence of substantial 
inter-interviewer variation on fixed-response opinion questions and factual 
questions. 

Parallel questions arise in connection with both the Denver and the 
Cleveland studies. The first arises out of the fact that only the nine in- 
terviewers within a given sector are compared with each other. If for 
some reason interviewers within the same sector tended to have the 
same biases while interviewers in different sectors had different biases, 
we would not have discovered differential net effects even though they 
did occur. It is extremely unlikely that this could have occurred on the 
Denver study because the interviewers within each sector were 
purposely contrasted in terms of a number of their characteristics such as 
age, sex, interviewing experience, etc. Since there is no known 
characteristic on which interviewers within a given sector were more 
homogeneous than interviewers in different sectors and since each 
sector had nine interviewers (one-fifth of the forty-five interviewers 
used), it seems inconceivable that much differential net effect could 
have been overlooked owing to this cluster aspect of the design. 

Of course, the degree of differential net effects found in any study is 
a function of the heterogeneity of the total group of interviewers used. 
In the two studies discussed in the foregoing, the interviewing staffs 
used were certainly as heterogeneous as a staff working within a single 
city on a particular normal survey generally would be. In Cleveland, 
the interviewing staff was composed of a few regular NORC 
interviewers and a great many people of varying interviewing experience 
recruited through newspaper ads and similar means. In Denver the 
interviewing crew was even more heterogeneous. Here the 
interviewers used came chiefly from two groups: experienced professional 
interviewers on the staffs of national and local research agencies, and 
students of social science at the University of Denver. Thus, there is 
no reason to assume that there was any appreciably less opportunity for 
differential net effects to occur on the Denver or Cleveland surveys 
than there would be on most regular surveys. If anything, the 
heterogeneity provided greater opportunity than under usual survey 
operations, thus making the negative findings even more compelling. 
Before going further into the nature of the inter-interviewer 
variation that has been found, it would be well to examine our conclusion 
that “for most fixed-response opinion questions there is relatively little 
inter-interviewer variation” in the light of other studies which seem to 
indicate the general existence of a considerable amount of such 
variation. Some differences between the design and analysis of the two 
studies discussed and earlier studies with conclusions at variance from 
ours may account for the different conclusions. 

First, there are a number of studies where the over-all distributions of 
responses elicited by different groups of interviewers are compared. In 
several instances, interviewers have not been assigned randomly to 
respondents. When these studies have been based on a national 
interviewing staff, there has been a correlation between the town or at least 
the general area in which the interviewer and respondent live. This 
correlation could of course lead to spurious differences between the 
respondents interviewed by interviewers contrasted in terms of their 
own opinions if there are positive intra-class correlations between 
specific place and both interviewer and respondent opinion. The 
differences between the responses obtained by the different groups of 
interviewers are generally tested for significance using a doubtful 
assumption. It is assumed that if there were no inter-interviewer variation, 
the responses of the respondents interviewed by different groups of 
interviewers would differ from each other to the same extent as would 
the responses of respondents in simple random samples of the same sizes as 
the aggregates of respondents interviewed by the given groups 
of interviewers. This testing procedure unquestionably leads to an 
underestimate of the possibility of getting such differences by chance. 
Some research workers have been aware of this spurious factor in their 
studies and have tried to correct for it. For instance, Cahalan, 
Tamulonis, and Verner excluded questions showing substantial regional 
variation from their analysis. Still, it is probable that even for the 
remaining questions there was substantial intra-class correlation between 
specific place and opinion remaining to inflate the differences between the 
respondents of interviewers with different opinions. One need only consider 
the differences in opinions that may exist between the residents of a 
large metropolis and the residents of a medium-size industrial town 
or the residents of a small farming community, all within a single 
region, to see the possibility that such a spurious factor might account 
for the results obtained by a study with such a design. Even within a 
single city, if interviewers are assigned to interview near their own 
homes, the same sort of spurious factor could account for the 
relationship between the interviewer's and the respondent's opinion. Thus, 
we cannot be sure whether studies employing this design which have 
found significant differences between the responses obtained by 
different interviewers really contradict our negative findings. 

A related problem involved in a number of studies is the absence of 
interpenetrating samples of respondents for different interviewers. An 
analytic problem arises, even though there is no reason to assume 
correlation of interviewer and respondent opinion, since the different 
interviewers or the different groups of interviewers whose results are 
compared for the determination of the incidence of inter-interviewer 
variation generally do not interview within the same spatial clusters. 
There is very likely to be a positive correlation between the place 
where a respondent lives and his opinions and characteristics. In that 
case, the geographical clustering of respondents would generally result 
in larger differences between the distributions of responses obtained by 
different interviewers than would appear if the interviewers had been 
assigned simple random samples. This statement would hold even if 
there were no real interviewer effect. Thus, when these studies are 
analyzed using assumptions of simple random sampling, or at least 
failing fully to take account of the extent of clustering, one underestimates 
the probability of finding variations between the results of interviewers 
as large or larger than those actually found, by chance, when there is 
no true inter-interviewer variation. 
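
The size of the understatement described above can be gauged with the standard design-effect approximation for equal-sized clusters, 1 + (m - 1) * rho, where m is the number of respondents per cluster and rho is the intra-class correlation. This formula is not given in the text and is brought in here only as an illustration; the cluster size, the correlation, and the naive test statistic below are hypothetical values, and the function names are assumptions of this sketch.

```python
import math


def design_effect(cluster_size, intraclass_corr):
    """Approximate variance inflation for equal-sized clusters."""
    return 1.0 + (cluster_size - 1) * intraclass_corr


def corrected_z(naive_z, cluster_size, intraclass_corr):
    """Deflate a z statistic that was computed as if sampling were simple random."""
    return naive_z / math.sqrt(design_effect(cluster_size, intraclass_corr))


# Hypothetical values: 25 respondents per interviewer cluster, rho = 0.05,
# and a difference that looked significant under simple random sampling.
print(design_effect(25, 0.05))
print(corrected_z(2.5, 25, 0.05))
```

Even a small positive correlation between place of residence and opinion is enough, under these assumed values, to move an apparently significant difference below the conventional critical ratio.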

In discussing these studies, we shall assume there is no outside 
knowledge from oth[er] [...] areas. If such information were available, it 
could be used to compute [...]. [...] if the purpose of the study is to 
examine the differences [...] obtained by interviewers with the different 
characteristics and not simply to establish the existence of variation between 
interviewers per se, this confounding of variances does not prevent one 
from testing his hypothesis. One can simply use, with only a minor 
adjustment, the observed variance between the results of interviewers 
with a given characteristic in this particular study, divided by the num- 
ber of interviewers, as the estimate of the variance of the mean for the 
distribution of responses obtained by interviewers with this character- 
istic. Thus, one can readily estimate the sampling variance of the 
difference between the means of the distributions of responses obtained 
by the groups of interviewers with differing characteristics and test for 
the significance of this difference. 
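
The procedure just described can be sketched as follows. This is a minimal illustration under stated assumptions: each interviewer's work is summarized by a single figure (for example, the proportion of his respondents giving a specified response), the "minor adjustment" mentioned in the text is not specified and is therefore omitted, and the function names and the per-interviewer figures are hypothetical.

```python
import math
import statistics


def group_mean_and_variance(interviewer_results):
    """Mean of per-interviewer results and the estimated variance of that mean.

    The variance of the mean is the observed variance between interviewers
    in the group divided by the number of interviewers.
    """
    k = len(interviewer_results)
    if k < 2:
        raise ValueError("need at least two interviewers in a group")
    mean = statistics.mean(interviewer_results)
    var_of_mean = statistics.variance(interviewer_results) / k
    return mean, var_of_mean


def compare_groups(group_a, group_b):
    """Difference between group means, its standard error, and their ratio."""
    mean_a, var_a = group_mean_and_variance(group_a)
    mean_b, var_b = group_mean_and_variance(group_b)
    diff = mean_a - mean_b
    std_err = math.sqrt(var_a + var_b)
    ratio = diff / std_err if std_err > 0 else math.nan
    return diff, std_err, ratio


# Hypothetical proportions obtained by three interviewers in each group.
print(compare_groups([0.40, 0.50, 0.60], [0.20, 0.30, 0.40]))
```

Because the between-interviewer variance already contains both the clustering of respondents and the variation among interviewers of the same type, the ratio is not inflated in the way a simple-random-sampling test would be. With so few interviewers per group, the ratio is better referred to a t distribution than to the normal.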

As we pointed out earlier, in most analyses of such material, the 
assumption of simple random sampling is made, i.e., the variance of the 
means of distributions of responses obtained by different groups of 
interviewers is estimated by assuming that the entire group of 
respondents interviewed by the interviewers with a given characteristic 
constitute a simple random sample from a universe of all interviewers 
interviewing all respondents (in the given area of the survey). But 
there is good reason to believe that the true sampling variance, the 
variance correctly estimated by the procedure previously described, 
is considerably larger than the expected value of the estimate of 
variance made on the assumption of simple random sampling, owing both 
to the positive correlation between area of residence and opinion of the 
respondent and to the variation between interviewers within a 
classification. Hence, it is probable that past studies have overestimated the 
extent of the incidence of differences in results obtained by different 
groups of interviewers. 
The second analytic procedure used in the analysis of studies using 
interviewers with non-interpenetrating clusters of respondents is 
the testing of significance of the inter-interviewer variation, without 
any grouping of the interviewers in terms of their characteristics. The 
distributions of responses obtained by different interviewers are simply 
compared with each other. Sometimes only the results of interviewers 
with similar assignments ([...] the same city, [...] having interviewed 
respondents with [...] demographic variables) are compared. Still, there 
is no telling to what degree the respondents in the clusters interviewed by 
different interviewers might be expected to differ from each other on 
relevant variables even if there were no inter-interviewer variation. 
Thus, here again we cannot take the findings of such studies at face 
value and must try to judge the validity of the findings in terms of 
outside knowledge. 


A third important factor to be considered in comparing the findings 
from the Cleveland and Denver studies with those from a number of 
earlier studies is the confounding of inter-interviewer variation in 
selection or sampling of respondents with inter-interviewer variation 
within the interview itself. In many of the earlier studies, 
interviewers were simply given identical quota assignments rather than a 
random sample of predesignated respondents. Thus, it is impossible in 
such studies to determine whether a difference in the opinions of the 
respondents of different interviewers is due merely to differences 
in the selection of respondents or whether there is also variation in 
performance during the actual interview. 
The Cleveland and Denver studies involved predesignated 
re[spondents] [...] where the interviewer was free to choose his own 
respondents. This fact, as well as evidence from two studies devoted 
specifically to comparing inter-interviewer variation under different 
conditions of sampling, indicates that much of what has been previously 
interpreted as differential net distortion [...] varying bias in the 
selection of respondents. While this is, of course, a significant 
component of interviewer per[formance] [...], its true character should not be 
[...] when probability samples [...] a function of the differenti[al] [...] 
of inter-interviewer variation. [...] abilities to complete their 
assignments of [...] discussion of inter-interviewer variation, [...] 
respondent loss-rate is small in magnitude, as in the Cleveland study, or 
the losses are examined to determine their distribution and consequent 
effects among interviewers as in the Denver study, there is the danger of 
misinterpreting the origin of the total inter-interviewer variation found. 

As was discussed earlier, studies where the distributions of responses 
obtained by several different groups of interviewers are compared 
generally fail to take account of variation between interviewers within 
a given group (i.e., between interviewers having a given characteristic). 
This factor should be considered in estimating sampling variance under 
the null hypothesis whether or not the different interviewers have been 
assigned interpenetrating random samples. We have discussed using the 
observed variance between interviewers within a classification as the 
basis for estimating the random error when non-interpenetrating 
clusters of respondents were assigned to interviewers. This same observed 
variance could also be used as the basis of estimation even when the 
interviewing assignments are interpenetrating. 

Another factor that may partially account for the general view that 
inter-interviewer variation is prevalent is the probable tendency to 
publish only positive findings. Although this supposition cannot be 
substantiated, it seems likely on a priori grounds that examinations of 
inter-interviewer variation which showed significant variation were more 
likely to be published, being in line with expectations and being, in a 
sense, less equivocal than studies which failed to find significant 
variation between interviewers. When an examination of the data 
(particularly when only few interviewers are involved or when each 
interviewer interviewed a rather small sample of respondents) fails to show 
statistically significant variation between interviewers, there is the 
omnipresent danger that the weakness of the significance tests has led 
to the neglect of differences that are really there, and so one hesitates 
to publish negative findings. Now, of course, even if there were no 
inter-interviewer variation, 5 per cent of all the significance 
tests made would indicate that observed variation was significant at the 
5 per cent level. If our supposition that many tests which failed to show 
significant variation were not published is correct, then it becomes more 
likely that a fair proportion of the published tests showing significant 
variation are actually in error, i.e., that they reject the null hypothesis 
that there are no differences between interviewers when the 
null hypothesis is true. Thus, the [...] incidence of inter-interviewer 
variation [...] may not be as much in contradiction to the findings of 
earlier studies as appears at first sight. 
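
The arithmetic behind this argument can be made concrete with a short sketch. It assumes, purely for illustration, that there is no true inter-interviewer variation at all, that every significant test is published, and that only some fraction of the non-significant tests is published; the function name and the publication rates are hypothetical.

```python
def significant_share_of_published(alpha, publish_rate_nonsignificant):
    """Share of published tests that are significant when the null is true.

    Assumes every significant test is published and that a fraction
    `publish_rate_nonsignificant` of the non-significant tests is published.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    if not 0.0 <= publish_rate_nonsignificant <= 1.0:
        raise ValueError("publication rate must lie between 0 and 1")
    published = alpha + (1.0 - alpha) * publish_rate_nonsignificant
    if published == 0.0:
        return 0.0  # nothing is published at all
    return alpha / published


# Hypothetical publication rates for non-significant tests.
for rate in (1.0, 0.5, 0.2, 0.05):
    print(rate, round(significant_share_of_published(0.05, rate), 3))
```

When every test is published the share of significant results is the nominal 5 per cent; as fewer negative findings reach print, the same 5 per cent of chance rejections makes up a growing part of the published record.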
There have been several studies made with designs similar to those 
of our Cleveland and Denver studies. In these studies, interviewers 
were assigned interpenetrating samples of predesignated respondents 
or households. Thus, the results of these studies are comparable with 
our results. 

Mahalanobis has reported several studies of the variation in the results 
obtained by different interviewers. In connection with the Bengal 
Labour Enquiry, the results obtained by five interviewers were com- 
pared. Significant inter-interviewer variation was found on two of the 
five questions examined. In connection with the Nagpur Labour En- 
quiry, the results obtained by four interviewers were compared. Here, 
significant inter-interviewer variation was absent from all four of the 
questions examined. In connection with two cost-of-living studies, cost- 
of-living indices were computed separately on the basis of each inter- 
viewer's work. In one of the studies, cost-of-living indices based on five 
different interviewers were compared without finding significant 
variation. In the other study, indices based on three different interviewers 
were compared with the same failure to find significant variation. Thus, 
significant variation was found on only two of the eleven comparisons 
made. Mahalanobis also reports an additional study, the Radio Pro- 
gramme Preference Survey. Here, each of three independent teams of 
investigators interviewed in one of three interpenetrating samples of 
respondents. The variation between the three samples was compared 
to the variation that would be expected if the three samples had been 
simple random samples from a binomial population. On fifteen of the 
eighteen questions examined, the observed variance was larger than the 
expected variance, and in seven of those instances the observed 
variation was significantly larger than the expected. But, it is not clear 
whether the three sampl[es] [...] whether there was cluste[ring] [...] 
the excess in observed va[riation] [...] variation or to the spati[al] [...] 
have no information about w[...] 


Shapiro and Eberhart examined differences in the distributions of 
responses obtained by four interviewers conducting essentially intensive 
interviews with comparable samples of respondents in a non-field 
survey situation. Interviews were conducted with respondents at local VA 
offices rather than at their homes, but since the general form of the 
questionnaire, the subject matter, and general interviewing procedure 
were not too far different from what might be found in an ordinary 
field survey, the findings are probably reasonably relevant to field 
surveys. The authors report significant or near significant variation 
between interviewers was found on ten of the thirty-four questions on 
the questionnaire. But, it should be noted that the interviewer's task on 
this survey was somewhat more complex than his task on most of the 
other studies reported here, including the Cleveland and Denver studies. 
Even though a number of the questions used were pre-coded, the 
interviewers were supposed to probe intensively on the questions before 
coding the response. Thus, opportunity for variant behavior existed 
in the situation to a greater extent than on the pre-coded questions used 
in the other surveys presented here; in these, the interviewer was 
expected to accept the initial response of the respondent or at least the 
first codable response after a minimum of probing. When one 
considers the opportunities for variation in the intensive interview 
situation, confirmed by our very own findings from the Denver study on 
[...] in open-ended questions, reported in Chapter V, the Shapiro 
and Eberhart findings are well in accord with our own. It should be 
noted, however, that the interviewers involved in their study were all 
highly motivated, and three of the four were highly experienced. All 
four were very well acquainted with the interview schedule and had an 
understanding of the goals of the study. 
Stock and Hochstim have reported a number of different analyses of 
inter-interviewer variation in studies using probability samples, but it is 
not clear in which, if any, the interviewers actually had interpenetrating 
samples. For the sake of the present discussion, we shall assume that in 
those cases where the samples did not interpenetrate, the overestimate 
of interviewer variance was relatively slight, although we cannot be 
sure of this. They report an experiment made in a medium-sized 
Eastern city designed primarily to examine relative inter-interviewer 
variation when the interviewer is assigned to a predesignated respondent 
and when he is assigned to a specified block but can choose respondents 
within the block on a quota basis. The probability sample part of the 
study is comparable to our own study. All the data needed to test the 
significance of the inter-interviewer variation on the probability sample 
are not available, but, from the data that are available, it is clear that, at 
most, only one of the six questions examined showed variation 
significant at the .05 level (in fact, a negative [...] was 
estimated [...] still [...] of the fact that the actual [...] could 
not be very large). The one question with considerable variation was 
a free response question. 


Stock and Hochstim also report a Bureau of Labor Statistics study in
Chicago, where each of six interviewers had to determine the selling
price of a number of different articles in three different types of store.
This task was, in essence, an interviewer rating because the interviewer 
had to decide which of the many articles of clothing in the store met 
the requisite specifications and was to be priced. From the data pre- 
sented in the article, it is impossible to test the statistical significance of 
the variation on most of the nine items priced. It is clear that there was 
significant variation on one of the items and that there was no sig- 
nificant variation on two others, but nothing can be said about the re- 
maining six. But even if most of the remaining items did show statisti- 
cally significant variation, this would only again substantiate the 
previous references to the high degree of inter-interviewer variation 
resulting when the interviewer's task involves considerable judgment 
on his part. 

Additional evidence is available from a survey conducted by the 
Bureau of the Census designed to measure inter-interviewer variation in 
connection with their Monthly Labor Force Survey in Baltimore in
December, 1947. The design of this study was somewhat unusual in
that only pairs of interviewers handled interpenetrating assignments,
but the same interviewer was generally paired with several different
interviewers in different segments. This slight modification in design
does not affect the comparability of the findings of this study to the 
findings of the other studies already discussed. In the Census study, the 
results of four different interviewers were compared on five questions. 


. . . questions, where the estimate of inter-interviewer variance is negative,
the variation could not have been statistically significant. Although
. . . tested precisely, it is very doubtful . . .


The study was executed through a factorial design so that the variations
due to a number of different factors (questionnaires, interviewer
groups, districts, age and sex of subject) could be examined
simultaneously with full efficiency.

The only factor to concern us here is the interviewer factor. Owing
to the factorial design, all factors interpenetrate. Hence, each interviewer
group was assigned an equal number of interviews, to be divided
equally among the three specific types of questionnaires,
and within each interviewer group each type of questionnaire was
administered to an equal number of respondents within each age-sex
category within each district. Thus, except for random variation with
respect to dependent variables between equivalent four-factor specific
groups, the three interviewer groups were given completely identical
assignments.
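
The balance property described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text gives three interviewer groups and three questionnaire types, but the number of age-sex categories, the number of districts, the number of respondents per cell, and all labels below are hypothetical.

```python
from collections import Counter
from itertools import product

# Hypothetical factor levels. Only the counts of interviewer groups (three)
# and questionnaire types (three) come from the text; the rest are assumed.
INTERVIEWER_GROUPS = ["group_A", "group_B", "group_C"]
QUESTIONNAIRES = ["form_1", "form_2", "form_3"]
AGE_SEX = ["young_men", "young_women", "older_men", "older_women"]  # assumed
DISTRICTS = ["district_1", "district_2"]                            # assumed
RESPONDENTS_PER_CELL = 5                                            # assumed


def build_assignments():
    """Return one record per interview in a fully crossed (factorial) design."""
    cells = product(INTERVIEWER_GROUPS, QUESTIONNAIRES, AGE_SEX, DISTRICTS)
    return [
        {"group": g, "questionnaire": q, "age_sex": a, "district": d, "replicate": r}
        for g, q, a, d in cells
        for r in range(RESPONDENTS_PER_CELL)
    ]


def workload_profile(assignments, group):
    """Count one group's interviews by (questionnaire, age_sex, district)."""
    return Counter(
        (rec["questionnaire"], rec["age_sex"], rec["district"])
        for rec in assignments
        if rec["group"] == group
    )


assignments = build_assignments()
profiles = [workload_profile(assignments, g) for g in INTERVIEWER_GROUPS]
```

Because every factor is crossed with every other, the three profiles are identical, which is the sense in which the interviewer groups receive equivalent assignments apart from random variation among respondents.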

The findings of the Kendall study have appeared in two papers.

The Durbin and Stuart paper was concerned mainly with variation in
performance in obtaining interviews with assigned respondents. It
seems possible that the variation in this aspect of performance may
account for some observed differences in the distributions of responses
obtained by different interviewers (assuming a correlation between a
respondent’s availability for interview and his responses) in studies with
rather high respondent loss-rates.

The main finding of the response rate analysis was that the students
were decidedly inferior to the interviewers of the two professional
organizations in obtaining interviews. “Within each group of interviewers,
there is no evidence of marked heterogeneity among the individual
interviewers. The results show that the main differences are
between the classes of interviewers rather than between individuals.”

It is worth noting that a large part of the excess losses of the
inexperienced interviewers was due to refusals. A far larger proportion
of the assigned respondents of the inexperienced interviewers refused to
be interviewed than of the experienced interviewers. This fact would seem
to indicate that inexperienced interviewers lack the temerity, ability,
and . . . being interviewed. It would also appear likely, then, that in
the interview situation itself, the inexperienced interviewers might fail
to press a reticent respondent as fully as necessary; the inexperienced
interviewer might be prone to accept refusals on individual questions or
“don’t knows” of an evasive nature without an adequate attempt to
overcome the resistance; he might also fail to probe as fully as necessary
in many instances. This consideration is in accord with our explanation
of the finding reported in the preceding chapter that the more experienced
interviewers elicited fuller responses to open questions than did the
less experienced.

Booker and David found no clear evidence that differences between 
experienced and student interviewers in results obtained within the in- 
terview were significant. One noteworthy finding on omission of 
questions was that the L.S.E. omission rate was markedly highest on the 
factual questions appearing at the end of the interview, again perhaps 
owing to the reticence or inability on the part of the inexperienced in- 
terviewer to press the respondent after having already asked a number 
of questions. 

The interviewers of all three organizations obtained practically 
identical proportions of noncommittal responses (responses like “don’t 
know,” “no preference,” “nothing in particular,” and “all parts” when 
the respondents were supposed to choose between alternatives that were 
matters of opinion rather than information). The absence of difference 
in this respect between experienced and inexperienced interviewers is 
rather remarkable. This result certainly detracts from the credibility 
of our hypothesis in the Denver study of greater reticence and inability 
to probe on the part of inexperienced interviewers. 

No consistent pattern of differences was found with respect to the 
number of responses obtained to questions permitting multiple answers. 
Thus, these results question the generality of our finding in the Denver 
study that experienced interviewers elicited more multiple responses 
than inexperienced interviewers. The basis of the contradiction is not
clear-cut, although conceivably the British experienced interviewers
had less practice with open or other multiple-response questions than
had their American counterparts.


Thus far we have discussed variation . . . three groups in terms of
certain . . . instead of the content of the responses themselves. Booker
and David also examined such differences in . . . accounted for in terms
of sampling variation alone, it should be remembered that some of the
significant . . . previously discussed differences in refusal rates or
similar factors extraneous
to the interview proper. Thus, here again there is relatively little evi- 
dence for the existence of widespread variation in results due to be- 
havior during the interview itself. 

By and large, there seemed to be no reason to assume that any of the 
differences in the distributions of recorded answers had anything to do
with the fact that one group of interviewers was inexperienced and the
other two were experienced. This fact confirms our general notion 
that much of the inter-interviewer variation that does occur is non- 
systematic in character. 

Actually, there is little reason to expect variation in the substantive 
content of responses obtained by groups of interviewers contrasted 
merely in experience. Variation between groups of this sort would be 
expected to be along more formal lines—e.g., the number of responses 
elicited, number of evasive responses, etc. The variation in substantive 
responses would be perhaps more affected by a factor like interviewer 
expectations than by the experience factor. Nevertheless, the Kendall 
study is significant because of its unique application of a factorial
design to the study of interviewer effect, and because of the contribution
of its specific findings. 

We have thus far seen that, in studies where the equivalence of the 
assignments of different interviewers has been insured through the pre- 
designation of randomly selected respondents, the prevalence of statisti- 
cally significant inter-interviewer variation has been rather low. It is, 
of course, true that in most of these studies each interviewer inter- 
viewed rather few respondents. Thus, the significance tests were on 
the whole rather weak, and so real but small differences between
interviewers were often overlooked. Still, when one considers the extent
of the tests made and their general agreement as to the absence of
significant variation on at least a majority of the fixed response pre-coded
questions requiring a minimum of interviewer judgment, it does not
seem possible that substantial inter-interviewer variation could be very
widely prevalent on such questions.
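
The weakness of significance tests based on few respondents per interviewer can be illustrated with a small sketch. The counts below are invented for illustration and come from none of the studies discussed; the test shown is an ordinary Pearson chi-square test of homogeneity, which is an assumption about method, since the studies reviewed used a variety of procedures.

```python
def chi_square_homogeneity(table):
    """Pearson chi-square statistic for a table of counts.

    Each row is one interviewer; each column is one response category.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Critical value of chi-square with 2 degrees of freedom at the .05 level.
CRITICAL_05_DF2 = 5.991

# Invented counts: three interviewers obtain 50, 60, and 40 per cent "yes".
small = [[10, 10], [12, 8], [8, 12]]        # 20 respondents per interviewer
large = [[100, 100], [120, 80], [80, 120]]  # 200 respondents per interviewer

chi_small = chi_square_homogeneity(small)
chi_large = chi_square_homogeneity(large)
```

With identical percentage differences, the statistic falls well below the critical value when each interviewer has 20 respondents and well above it when each has 200. A real difference of this size would therefore pass unnoticed on the smaller assignments.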

Yet, in earlier chapters, we showed that certain processes of
interviewer distortion (expectation effects, clerical errors, reaction effects,
etc.) did occur, and in the earlier parts of this chapter, we indicated
through the validity studies, the recorded interview studies, and the
panel studies that gross effects did occur in field studies. These findings
of gross interviewer effects on responses would appear to be somewhat
in contradiction to our conclusion that substantial inter-interviewer
variation was not particularly prevalent. Two important considerations
help reconcile these divergent findings.

The gross effects need not vary particularly from interviewer to
interviewer. All interviewers can bias their results in more or less the
same fashion, and thus the distributions of responses obtained by
different interviewers need not differ particularly even though they are
all affected.

The second consideration involves the fact that only net effects show
up as inter-interviewer variation. If we consider an interviewer as
having a strong need to find all of his respondents agreeing with him on
every issue (or even disagreeing with him) and if there are differences 
in opinion among the members of the interviewing staff, then we'd ex- 
pect large net effects to occur and along with them substantial inter- 
interviewer variation. But, if we view the interviewer as being es- 
sentially task oriented and as engaging in biasing behavior or making 
other interviewing errors solely to expedite getting his job done as 
painlessly as possible, then there is no particular reason why distortions 
of individual responses may not simply cancel out over a number of
respondents. In general, the preceding chapters tend to support a view 
of an interviewing situation in which the interviewer is mainly task 
oriented—involved in getting his job done, not so much concerned 
with what his respondents say. Thus, it is not contradictory that each 
interviewer should distort a large number of individual responses, but 
that the distributions of responses obtained by different interviewers 
should in general look much the same. 
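
The contrast between the two views of the interviewer can be made concrete with a rough simulation. Everything in it is an assumption made for illustration: the true division of opinion (half "yes"), the distortion rate (one response in five), the assignment size, and the three hypothetical interviewer styles. It is a sketch of the argument, not a model fitted to any of the studies discussed.

```python
import random


def simulate(styles, n_respondents=2000, true_yes=0.5, distort_prob=0.2, seed=0):
    """Return (share recorded "yes", share of responses distorted) per interviewer.

    Hypothetical styles:
      "task": distortion is unrelated to content (the answer is simply misrecorded)
      "pro":  distortion always pushes the recorded answer to "yes"
      "con":  distortion always pushes the recorded answer to "no"
    """
    rng = random.Random(seed)
    results = []
    for style in styles:
        yes = 0
        distorted = 0
        for _ in range(n_respondents):
            answer = rng.random() < true_yes
            recorded = answer
            if rng.random() < distort_prob:
                if style == "task":
                    recorded = not answer
                elif style == "pro":
                    recorded = True
                elif style == "con":
                    recorded = False
            distorted += recorded != answer
            yes += recorded
        results.append((yes / n_respondents, distorted / n_respondents))
    return results


def spread(results):
    """Largest difference between interviewers in the share recorded "yes"."""
    shares = [share for share, _ in results]
    return max(shares) - min(shares)


task_oriented = simulate(["task"] * 6)
opinionated = simulate(["pro"] * 3 + ["con"] * 3)
```

Under these assumptions the task-oriented interviewers each distort roughly a fifth of their individual responses, yet their distributions stay close together, while the opinionated interviewers split into two visibly different groups. The even division of true opinion is a special case chosen so that the content-free distortions offset one another exactly in expectation.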

There is no particular reason to assume from this that different inter- 
viewers will get the same responses from a single respondent or a group 
of respondents. As was indicated earlier in this chapter, a single inter- 
viewer interviewing the same respondent twice is more likely to get 
the same answers than are two different interviewers interviewing the 
same respondent. Although there is undoubtedly a great deal of random 
or situational error in interviews, it still seems very possible that
different interviewers may exert differential net biases on given
respondents or subgroups of respondents. These individual biases may
cancel out to a large extent when the total assignment per interviewer
contains a number of respondents or a number of groups of
respondents.


Some interesting findings in a study by Mahalanobis illustrate just
such a situation where net differential . . . with the Nagpur Labour
Enquiry, discussed earlier in this chapter, the
four interviewers obtained different results within certain given areas,
but the differences they obtained were not constant from area to area.
Yet, on none of the four questions analyzed did the aggregate
distributions of responses for the four interviewers differ significantly.
The biases apparently canceled out over the five areas. Thus, if the
study had been made in only one or two of the five areas, we might
have concluded that there was significant inter-interviewer variation.
It is doubtful that such situations are very common, but this particular
finding is interesting as an indication of how differential interviewer
bias can exist and still not be manifested in marginal distributions.
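
A toy calculation shows how this can happen. The figures are invented and are not taken from the Nagpur Labour Enquiry; for brevity they use two hypothetical interviewers and two areas instead of four interviewers and five areas.

```python
# Invented counts of "yes" answers out of 100 interviews
# per interviewer per area (not data from any study).
PER_CELL = 100
yes_counts = {
    "interviewer_1": {"area_A": 70, "area_B": 30},
    "interviewer_2": {"area_A": 50, "area_B": 50},
}


def within_area_gap(area):
    """Difference between interviewers in the share of "yes" within one area."""
    shares = [counts[area] / PER_CELL for counts in yes_counts.values()]
    return max(shares) - min(shares)


def marginal_share(interviewer):
    """Share of "yes" for one interviewer over all areas combined."""
    counts = yes_counts[interviewer]
    return sum(counts.values()) / (PER_CELL * len(counts))


gaps = {area: within_area_gap(area) for area in ("area_A", "area_B")}
marginals = {name: marginal_share(name) for name in yes_counts}
```

Within each area the two interviewers differ by twenty percentage points, but in opposite directions, so their marginal distributions over both areas are identical. A study confined to a single area would have shown a marked interviewer difference.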

On almost every study examined, some questions did show variation 
between interviewers. Two questions arise about the nature of this 
variation: In what manner did the distributions of responses differ from 
each other, and how were the variant distributions compounded out of 
the total interviewing staff? 

With respect to the first question, the only reasonable answer seems 
to be that absolutely anything can happen. If the interviewer distortion 
stemmed mainly from the desire of the interviewer to have respondents 
hold certain opinions, then one might expect the responses obtained to 
be pushed in a single direction or conceivably toward a “don’t know” 
category.

In practice, we occasionally find distributions of responses differing
in this manner. These differences may have arisen in a situation where
the interviewers were concerned with the content of the response. But
there are numerous situations where we find differences which could
not readily arise through a content bias. For instance, there are
situations where there are too few responses at both ends of the continuum
and too many heaped into the middle category. There are situations
where the middle category has too few responses, and both ends of the
continuum have too many. There are even situations where the “don’t
know” category has too many responses, and both ends and the middle
of the continuum all have too few. One gets the feeling from viewing
such cases that it is not so much concern with the substantive content
of the response that leads to inter-interviewer variation as it is
differences in the perceptual frame of reference of interviewers when they
code responses in the field, when they select parts of answers to open
questions to record, or when they decide which answers need probing
and which don’t. Interviewers have different criteria for judging
whether a response adequately answers a question or whether it
requires further probing. Then, there are, of course, variations in
interviewers’ ability to think of proper probes for vague responses, as well
as variation in their morale, or their desire to do a good interviewing
job. Factors like these can explain how the distributions of responses
can vary with no apparent relation to the substantive contents of the
questions.

Similar conclusions as to the nonsubstantive source of much of the
variation between interviewers were reached by Shapiro and Eberhart.
It should be remembered that, in their study, extremely large differences
were found between the distributions of responses obtained by
different interviewers on a number of questions. These differences
occurred in the proportion of “don’t know” and “not ascertainable”
responses as well as in positive response categories on attitude questions.
We shall quote at length from their discussion of interviewer variation 
because of its relevance for our own discussion here. 


The study of interviewer bias has most often been concerned with the 
influence of such factors as the interviewer's social or racial status and 
personal opinion on responses obtained to attitude questions. The emphasis 
on these sources of bias should not lead one to assume that controlling them 
will solve all or even the greater part of the problem of bias. Unfortunately 
the problem of interviewer bias is frequently complicated by the presence 
of factors which are unrelated to status and opinion but which are a direct 
function of interviewer performance.

The characteristics of the interviewers ruled out the possibility that
differences in status were large enough to produce differential biases among
them. Furthermore, it was clear . . . that they were thoroughly aware of
the necessity for not influencing responses by suggestion.

In the analysis of the interviews with on-the-job trainees it was possible
to separate from the general area of interviewer bias the following
deviations from ‘good’ interviewer performance which contribute to bias: (a)
reliance on an initial response; (b) incomplete reporting of the respondent’s
answers; and (c) independent decisions by an interviewer concerning the
necessity for asking questions included in the schedule. The succeeding
paragraphs demonstrate how each of these variations operated in a specific
attitude question to produce a bias.

It is apparent from the analysis that the errors were not equally
distributed among the four interviewers. In about half the instances of
interviewer difference, A was the principal variant . . . his interviews reveal the
effects of some attitudes that did not characterize the other interviewers.
These attitudes had to do essentially with method, and not with the subject
matter covered by the interview.

. . . it is useful to comment briefly about the kinds of interviewer
difference found in the present survey. . . .

Instances of apparent interviewer bias on attitude questions were
discovered. These appeared to result not from variations in the interviewers’
own attitudes toward the topics covered by the questions, but from
differences in the interviewing methods used. . . .

There were fewer differences between interviewers in classifying re- 
spondents’ answers, but instances did occur. This kind of variation can 
occur as frequently as interviewers are required to perform also as coders. 
In classifying information after the respondent has given it to him the inter- 
viewer must use his own judgment as to the meaning of the reply and the 
meaning of the answer categories he is supplied with. These judgments can
vary widely from interviewer to interviewer if the categories lack precision
or if the interviewers are inadequately trained.

This explanation of inter-interviewer variation fits very well with the
fact that, on the whole, variation is not highly prevalent. For, if the
substantive content of the response is not the main factor underlying
interviewer distortion, it can readily be seen that various distorting
errors made by an interviewer could cancel each other frequently over a
series of respondents. This consideration gives further credence to our
view of the nonsubstantive source of a great deal of inter-interviewer
variation.

We do not wish to imply here that no interviewer variation originates
out of the classical substantive source. Obviously, there are some
interviewers who on some questions on some surveys have a strong
predisposition to get certain responses owing to their own expectations
or ideology. We certainly have viewed distributions distorted
unidirectionally as in the models presented earlier, and in many instances
this distortion was in the direction of the interviewer’s own ideology.
But we cannot tell in any particular case what the basis of the distortion
was, and we wish to stress here that in many instances neither the
interviewer’s own ideology nor even his expectations need have been
the basis for his distortion of responses.

With regard to the distribution of variant tendencies throughout the
interviewing staff, we have relatively little evidence owing to the small
number of cases interviewed by each interviewer on most studies and
owing especially to the small number of interviewers used in most of
these studies. It is our general impression, though, that for most
questions, most interviewers get more or less the same distributions of
responses while a few interviewers get highly aberrant distributions. For
instance, the significant variation on the interest subquestions on the
Denver study, discussed earlier in this chapter, was due in several
instances to the fact that one or two of the nine interviewers in each of
two or three sectors reported a large proportion of “don’t know”
responses while all the remaining interviewers reported very few such
responses. In other cases, . . . interviewers in one sector got far fewer . . .
category than did other interviewers, while on the same question in
some other sector, one interviewer pushed most of the responses in the
direction of an extreme category.

Only in the rarest instances have we noted a bimodal distribution:
two nearly equal-sized groups of interviewers where each member of
one group obtained one type of distribution while each member of the
other group obtained a considerably different distribution of responses.
Thus, either there is little net interviewer effect or most interviewers
tend to distort their responses in the same fashion. But, on some particu- 
lar questions, a few aberrant interviewers engage in highly idiosyncratic 
behavior and turn in results considerably different from those of the 
majority interviewers. This phenomenon of the aberrant interviewer 
emphasizes the danger of predicating generalizations about interviewer 
effect on experiments involving the comparisons of only a limited num- 
ber of interviewers. Only when the results of the aberrant interviewer 
who happens to be included in the study can be incorporated into a
large body of results from many interviewers, can we attenuate his in- 
fluence on our generalizations. 

This distribution of distortion throughout the interviewing staff on 
particular questions fits well with our conception of the basis of distor- 
tion. If the substantive content of the response were the main de- 
terminant of distortion, then one would expect that on questions where 
interviewer opinions or expectations were reasonably equally divided, 
the interviewers would obtain some sort of bimodal distribution of re- 
sponse distributions—a considerable proportion of interviewers would 
get response distributions biased one way while a considerable propor- 
tion would get response distributions biased in the opposite way. But, 
if distortion enters through the misunderstanding or the disobedience 
of instructions, then a J-Curve situation would exist—most of the inter- 
viewers would get about the same results, but a few would occasionally 
get highly deviant distributions. 

It also should be noted that it is not always the same interviewers who 
are aberrant on different questions. Although we have shown that there 
is some systematic component to interviewer performance in that there 
is a positive intercorrelation in the number of multiple answers obtained 
by interviewers on different questions and a positive intercorrelation 
in the proportion of invalid responses obtained by interviewers on dif- 
ferent questions, these intercorrelations are generally of only a moder- 
ate magnitude. There is plenty of room left for interviewer performance 
to vary from question to question as illustrated by the low intercorrela- 
tions in unreliability over different questions from the Elmira political 
study discussed in Chapter V. Actually, there are many instances where 
an interviewer obtained a highly deviant distribution of responses on 
one or two questions but not on others, while interviewers who were 
not deviant on these first questions were deviant on one or two other 
questions. Thus, inter-interviewer variation appears generally to be a 
highly idiosyncratic rather than a systematic phenomenon.


CHAPTER VII 


Reduction and Control of Error 


An underlying purpose of the interviewer effect study was to lay the 
groundwork for a systematic approach to the reduction and control of 
error arising from the interview process. Before we could consider 
methods of accomplishing this control, it was necessary to learn as 
much as possible about how, under what conditions, and to what ex- 
tent interviewer effects operate. In the preceding chapters, therefore, 
we have explored the nature of the interview situation, examined some 
of the specific factors which bring about interviewer effect or error, 
and provided some evidence on the total amount of error actually oc- 
curring under normal field conditions. 

On the basis of the evidence presented in Chapter VI, it might appear 
that the magnitude of error under normal field conditions is so negli- 
gible that there is no need for lengthy discussion of methods for con- 
trol or reduction of error. This would be a most hasty conclusion for 
a number of reasons. Even if one were to grant that the sources of
potential error seem to be under control at the present time, since
error is not manifest, this might simply mean that survey agencies have
managed to hit upon lucky procedures. Such luck is hardly insurance
against error in general. The history of election forecasting provides a
most appropriate analogy to the present problem. The successful
forecasts of a dozen years did not preclude a failure in 1948, and upon
analysis it seems that the success was based on a precarious system in
which certain errors had temporarily been under control, or had been
in abeyance, given certain situations, or operated in totality in such a
way as not to jeopardize the final results. A far better insurance of
future success than mere past success is systematic knowledge of the
process underlying interviewer effect, and systematic discussion of
methods of control.

It should also be noted that the evidence presented in Chapter VI on
the magnitude of error under normal field conditions is neither massive
enough in quantity nor based on a sufficient sampling of types of field
conditions to permit us to conclude that the results of normal surveys
are not seriously distorted by interviewer effect. The two large-scale
field studies reported in that chapter are both based on the staffs of one
field agency, NORC, and cover, of necessity, a limited range of contents
and situational factors. These studies were supplemented by evidence
from other studies in an attempt to get an estimate of the problem
that would be more typical. But still the question of evidence on 
normal field surveys poses a sampling problem far more difficult than 
the sampling of humans, and one which the statisticians have hardly 
touched. 

For these reasons, it is desirable to deal with the reduction and con- 
trol of interviewer effect, and to summarize the implications of the 
earlier chapters for the problem.

It will require time and research to develop the implications of this 
study for error control. Greater understanding of the interview situa- 
tion provides no magical formula for eliminating interview bias or er- 
ror, but it should help to define the appropriate directions for research 
to take and to correct misapprehension as to the factors which need 
most attention. Immediate or short-run solutions will have to be explored
within the framework of the particular problem and the administrative
and operating limitations involved. But the conditions of present-day
research must not be regarded as fixed and unalterable, if a
serious attack on some of the fundamental sources of bias is to be made.
In this chapter, we shall discuss some of the methods which may be
effective in reducing or controlling error as suggested by this study and
by the research of others.

Approaches to the problem of reducing error may be classified into 
three groups: 

1. Empirical methods which attempt to remove or diminish the
source of error, so that minimum error will occur in the interview.

2. Empirical methods which may allow effects to operate in the
interview, but seek to bring about a cancellation of effects over all
interviewers or to produce homogeneity among interviewers so as to
eliminate at least the differential effects of different interviewers.

3. Formal or mathematical methods which allow effects to operate 
in the interview, but attempt by analysis or measurement of the mag- 
nitude of the effects to minimize or estimate their influence on final re- 
sults. 

The methods employed to remove the source of error will depend on 
what the source is. Methods which aim at the cancellation of effects or
at minimizing or estimating them by analysis and measurement apply 
generally to error from all sources. 


1. CONTROL OF ERROR ARISING FROM FACTORS WITHIN THE INTERVIEWER 


Empirical approaches to the control of interviewer effects through 
the manipulation of the interviewer may take the form of improvements
in selection and training of interviewers or improvements in general
personnel policy which will reduce turnover among the better
interviewers or attract people of superior ability to interviewing work.

Improvement in the selection of interviewers requires some decision 
on the part of survey agencies as to what particular traits are desirable 
in an interviewer. If all kinds of interviewer error were positively and 
highly correlated, this problem would not arise, but in so far as skills 
are independent, some choice has to be made as to which skills are 
primary. 

The essential phases of the interviewer's work are:

1. Sampling. The interviewer must be able to follow instructions for 
probability sampling or to use good judgment in selection under quota 
controls. 

2. Obtaining accurate information. The interviewer must be able to 
get respondents to answer fully and truthfully, so that the opinions they 
express are not influenced by the interviewer. Social skills, accuracy in 
asking questions, and skill in probing are required in this phase of the 
work.

3. Recording. The interviewer must be thorough and accurate in
recording the respondent's answers.

An interviewer may be skilful in one of these phases but not in another.
The interviewer who is careless in the clerical work of recording
answers may use excellent judgment in probing equivocal or vague
answers in an unbiased manner. An interviewer skilful at getting the
respondent to “open up” may find it difficult to follow complicated
sampling instructions or may be prone to obtain or record too many
responses in accord with his own expectations or opinions.

Before improvement in selection of interviewing personnel can come,
it is essential to know to what degree these skills are compatible with
each other and what types of individuals are most likely to have
combined skills.

Intercorrelations of Interviewer Skills 


An unpublished study of the American Jewish Committee described
in some detail in Chapters III and VI provides some evidence on the
intercorrelations of interviewer skills based on actual observation of the
interview itself by means of concealed wire recorders and on comparison
of the recordings with the completed schedules. Where interviewer
performance is judged only by examination of the completed schedules,
some of the more important components of interviewer skill cannot be
adequately evaluated. The schedule may be completely filled out, with
adequate replies on free-answer questions, but the central office can
only infer the interviewer’s skill in probing, his ability to obtain good
rapport with the respondent, or his accuracy in asking the questions and
recording the answers. There is nothing to show definitely whether the
answers on the schedule really represent the respondent's views,
whether the interviewer exhibited biasing behavior by projecting his
own opinions into the interview situation or even "made up" the
answers himself when he failed to ask the question or the respondent did
not reply.

Table 66 gives the intercorrelations among four types of errors and
the correlation of each type of error with the total number of errors.


TABLE 66 


INTERCORRELATIONS OF TYPES OF ERRORS IN A.J.C. STUDY


                      Probing    Recording     Cheating    Total
                      Errors     Errors        Errors      Errors
Asking errors.....      .23      .440 [sic]      -.12       .53
Probing errors....               .58              .24       .81
Recording errors..                                .04       .71
Cheating errors...                                          .53


The intercorrelations among asking, probing, and recording errors
are all positive, although only the probing-recording correlation is
significant at the 5 per cent level. The results suggest a moderate degree
of association between the various abilities. The low correlations of
cheating errors with the other kinds are largely an artifact of the
method of scoring; when the interviewer failed to ask the question but
nevertheless supplied an answer, no other error could occur on that
item. This also explains the negative correlation between cheating and
asking errors. However, the correlations do indicate that cheating
behavior is not closely related to errors in general.

Since each interviewer had only two or three respondents and these
respondents played the same roles for all fifteen interviewers, the
validity of the intercorrelations is not certain. They may partly
measure characteristics of the particular respondents, as well as those of
interviewers. Intercorrelations based on a large sample of respondents
in situations offering a more normal variety of stresses might be
different.
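
The statement that only the probing-recording correlation reaches the 5 per cent level can be checked roughly with the usual t-test for a correlation coefficient. The sketch below is only an illustration. It assumes that each coefficient is an ordinary product-moment correlation computed across the fifteen interviewers, and it takes the two-tailed 5 per cent critical value of t for 13 degrees of freedom (2.160) from standard tables rather than from the study. The function and constant names are hypothetical.

```python
import math

# Two-tailed 5 per cent critical value of Student's t for 13 degrees of
# freedom, copied from a standard t table (an assumption of this sketch).
T_CRIT_DF13 = 2.160

# Assumed sample size: the fifteen interviewers mentioned in the text.
N_INTERVIEWERS = 15


def t_for_correlation(r: float, n: int) -> float:
    """Return the t statistic for testing a correlation r based on n pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# The probing-recording value (.58) is quoted in the text; -.12 is the one
# negative coefficient, which the text attributes to cheating and asking.
examples = {
    "probing-recording": 0.58,
    "asking-cheating": -0.12,
}

for label, r in examples.items():
    t = t_for_correlation(r, N_INTERVIEWERS)
    verdict = "significant" if abs(t) > T_CRIT_DF13 else "not significant"
    print(f"{label}: r = {r:.2f}, t = {t:.2f}, {verdict} at the 5 per cent level")
```

Under these assumptions a coefficient of .58 clears the critical value while the smaller coefficients do not, which agrees with the text. With so few interviewers, only fairly large correlations can be distinguished from chance.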

A laboratory experiment to test probing ability of NORC interviewers,
which was described in Chapter VI, gives an opportunity to compare
probing skill in a laboratory situation with the regular over-all
interviewer ratings based on field performance as determined from the
completed schedules. From the results of the experiment, a “probing
tendency” score was calculated for each of sixty-one interviewers. A
score of 100 means that the interviewer probed the answers he received
with the average frequency for all interviewers receiving these answers.
The scores ranged from 26 to 171. In order to examine the association
between probing behavior and the regular interviewer ratings, the
average of the last three ratings was used to obtain greater stability.
Interviewers were divided into two roughly equal groups: the thirty
highest in this average rating compared with the remaining thirty-one.
The distribution of “probing-tendency” scores for high- and low-rating
interviewers is shown in Table 67 below.


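TABLE 67


COMPARISON OF PROBING SKILLS WITH REGULAR RATINGS
(61 NORC Interviewers)


“Probing Tendency” Score    High-Rating    Low-Rating    Total
                            Group          Group
Less than 50............        1              3          [?]
[?].....................        1              5          [?]
[?].....................        5             [?]         [?]
91-110..................        9             [?]         [?]
111-130.................        8              6          [?]
131-150.................        3             [?]         [?]
Over 150................        3              1          [?]

                               30             31          61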


There seems to be some association here, but it is not very strong.
The mean probing tendency score for the high-rating group was 106
as compared with a mean score of 94 for the low-rating group. The
biserial correlation between ratings and probing scores is .25. The
difference between means and the biserial correlation co-efficient are both
a little short of significance at the 5 per cent level. It does seem to be
true that the very low probing scores, which indicate that the
interviewer was far below the average in ability to perceive uncodable
answers which needed further probing, were obtained almost entirely
by the low-rating group; of the fourteen probing scores below 80,
eleven were obtained by interviewers in the low-rating group.
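
For readers unfamiliar with the biserial coefficient, the following sketch shows how it is conventionally obtained from two group means when a continuous rating has been split into a high and a low group. The group means (106 and 94) and group sizes (30 and 31) are from the text. The overall standard deviation of the probing scores is not reported, so the value of 30 used here is purely hypothetical, and the function name is likewise an assumption of this sketch.

```python
from statistics import NormalDist


def biserial_r(mean_high: float, mean_low: float, sd_all: float,
               n_high: int, n_low: int) -> float:
    """Biserial correlation from group means and the overall standard deviation."""
    p = n_high / (n_high + n_low)      # proportion in the high group
    q = 1.0 - p                        # proportion in the low group
    std_normal = NormalDist()          # mean 0, standard deviation 1
    z_cut = std_normal.inv_cdf(q)      # cut point leaving proportion p above it
    y = std_normal.pdf(z_cut)          # ordinate of the normal curve at the cut
    return (mean_high - mean_low) / sd_all * (p * q / y)


# Means and group sizes are from the text; the standard deviation of 30 is
# a hypothetical value chosen only for illustration (it is not reported).
r_b = biserial_r(mean_high=106, mean_low=94, sd_all=30, n_high=30, n_low=31)
print(round(r_b, 2))
```

With the hypothetical standard deviation of 30 the sketch gives about .25. This shows only that the reported means and coefficient are mutually consistent with a spread of that order. It does not establish what the actual standard deviation was.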


Further evidence on intercorrelations of interviewer skills is given
in Sheatsley’s study of the interviewer labor market. Each NORC
interviewer is rated regularly on (1) his performance on free-answer
questions, as measured by the completeness and relevance of his
verbatim and free-answer material, (2) his clerical ability, as defined by
the interviewer’s apparent skill in asking the questions properly and
recording the answers accurately, and (3) his sampling ability, which
is determined by his faithfulness in following instructions in making
his selections under quota controls. These three ratings provide some
measure of the interviewer’s performance in the three essential aspects
of his work, in so far as this can be determined from examination of
the completed schedules.

Table 68 presents the correlation co-efficients among these measures 
of performance, based on average ratings over a period of time. 


TABLE 68 


INTERCORRELATIONS BETWEEN INTERVIEWER SKILLS
(Based on 1,161 NORC interviewers)


                                                    Tetrachoric
                                                    Correlation
                                                    Co-efficient
Free-answer ability and clerical ability.........       .52
Clerical ability and sampling ability............       .46
Free-answer ability and sampling ability.........       .33


There is no question as to the statistical significance of the
correlations based on 1,161 interviewers. The fact that they are all positive
and moderately high indicates that the skills measured are not
completely discrete.
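
A tetrachoric coefficient is estimated from a fourfold table in which each skill is cut into "high" and "low." The sketch below uses the well-known cosine-pi approximation, which is not necessarily the computing method used in the study. The cell counts are hypothetical (chosen only to total 1,161), since the underlying fourfold tables are not given in the text, and the function name is an assumption.

```python
import math


def tetrachoric_approx(a: int, b: int, c: int, d: int) -> float:
    """Approximate the tetrachoric correlation of a 2x2 table.

    a = high on both skills, b = high on the first skill only,
    c = high on the second skill only, d = low on both.
    Uses the cosine-pi approximation, not an exact maximum-likelihood fit.
    """
    if b == 0 or c == 0:
        return 1.0   # no discordant cases: treated as perfect association
    if a == 0 or d == 0:
        return -1.0  # no concordant cases: treated as perfect inverse association
    odds_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))


# Hypothetical fourfold table for 1,161 interviewers (not from the study).
r_tet = tetrachoric_approx(a=350, b=230, c=230, d=351)
print(round(r_tet, 2))
```

When the concordant and discordant cells are balanced (odds ratio of 1) the approximation returns zero, and it rises toward 1 as the concordant cells come to dominate.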

It was not possible to determine how the intercorrelations vary with
experience or by type of interviewer. The lower correlations of
sampling ability with free-answer ability may be partly spurious, for an
interviewer who rates low on sampling ability because he selects too
many upper-class educated persons may rate higher in free-answer
ability simply because such respondents are more likely to talk freely.
Also free-answer ratings, based only on the completed schedules, had
to be taken as a measure of the interviewer’s ability in probing and
rapport as well.

Sheatsley concludes that “nevertheless, the data do indicate that
most NORC interviewers tend to be generally good, generally fair, or
generally poor.”

The relatively high correlation between free-answer ability and
clerical ability does not seem to support the notion that precise,
meticulous persons are likely to lack social skills. Several explanations
may be suggested:

1. A person markedly lacking in either social skills or clerical ability
is not likely to be hired as an interviewer.

2. “Clerical ability,” as measured by the ratings, is quite different
from traditional clerical ability, as measured, for example, by the
standard Minnesota Clerical Test. Ability in asking questions and
recording answers in a social situation like the interview requires some
social skill, as well as the exercise of judgment and intelligence. Guest
and Nuckols found practically no association between scores on the
Minnesota Clerical Test and interviewer recording errors, even in a
laboratory experiment. But they point out that a special kind of clerical
ability is required in interviewing, and that the type of clerical task
performed on the Minnesota Clerical Test could not with certainty be
expected to predict this type of performance. McRae found that clerical
ability (measured by omission of questions or failure to record
answers in the interview) was associated with ability to handle the
enumeration process which involves an interpersonal relationship with
the respondent, but not with the other paper work involved in following
directions on an area sample such as listing dwelling units, etc. If
we consider that “free-answer ability” requires the most skill in
interpersonal relationships, “clerical ability” the next greatest skill, and
sampling ability the least, it is consistent that “free-answer ability”
should be most highly correlated with “clerical ability” and least
with sampling ability.

3. “Free-answer ability” is not solely a matter of social skill or ability
to obtain good rapport, but also requires the exercise of judgment and
intelligence in probing and recording responses, qualities which would
seem related to “clerical ability.” The moderately high positive
correlation (.58) between skill in probing, an element of “free-answer ability,”
and recording accuracy, an element of “clerical ability,” cited earlier
from the A.J.C. study, is further confirmation that the two abilities
are related through common underlying elements, so that we would
expect a fair degree of correlation between the two ratings. Even if
we supposed that “free-answer ability” consisted of 75 per cent social
skill and 25 per cent intelligence, while clerical ability consisted of 25
per cent social skill and 75 per cent intelligence, the correlation
between them would be about .60. There is reason to believe, moreover,
that social skills are not as important a determinant of the free-answer
rating as in this example, for there is evidence that some of the
elements that enter into “free-answer ability” (probing skills, for
example) may not be closely related to social skills. In the A.J.C. study
referred to previously, judges’ ratings of the naturalness, friendliness,
and rapport of the interviewers show no positive correlations with
either recording or probing skill.
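
The figure of about .60 in the third explanation can be reproduced with a short calculation. The sketch below rests on assumptions not spelled out in the text: that the percentages are literal mixing weights, and that social skill and intelligence are uncorrelated components of equal (unit) variance. The function name is hypothetical.

```python
import math


def composite_correlation(weights_x, weights_y):
    """Correlation of two weighted sums of the same independent components.

    Each component is assumed to have unit variance and to be uncorrelated
    with the others, so covariances reduce to sums of weight products.
    """
    cov = sum(wx * wy for wx, wy in zip(weights_x, weights_y))
    var_x = sum(w * w for w in weights_x)
    var_y = sum(w * w for w in weights_y)
    return cov / math.sqrt(var_x * var_y)


# Weights on (social skill, intelligence), read as literal mixing weights.
free_answer = (0.75, 0.25)
clerical = (0.25, 0.75)
r_example = composite_correlation(free_answer, clerical)
print(round(r_example, 2))  # 0.6
```

Under this reading the covariance is .375 and each variance is .625, giving .375/.625 = .60. If the percentages were instead read as shares of variance, the same reasoning would give a higher figure (about .87), so the weight interpretation appears to be the one intended.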


Guest also obtained results in an earlier study which suggest that the
correlation between “naturalness” and interviewer competence as
measured by lack of errors is low or negative. In a later study, Guest
and Nuckols found a negative correlation (.32) between “agreeableness”
and performance, as measured by lack of errors. In another
study, Keyes noted some tendency to superior performance for
“introvertive” personality groups and those with “low social adjustment”
generally, especially in probing ability, although the differences were
not clearly significant. Finally, over-all NORC ratings for interviewers
whose past job experience involved persuasion or approach were
lower than for other interviewers, although their average scores on
“free-answer ability” were fairly high.

The cumulation of this evidence leads to the tentative conclusion
that, although social skill plays some part in the survey interviewer's
work, it is not closely related to the other skills demanded by the job,
and that excessive social orientation of the interviewer is not conducive
to superior performance. This view is reinforced by the qualitative
material presented in chapter II. Earlier conceptions of the interview
process have emphasized its social nature and in consequence have
tended to enthrone good rapport as the sine qua non of the successful
interview, and to over-evaluate the socially oriented personality as the
[...] show that the analogy with "selling" has been pressed too far. True,
a [...] and ability to meet people is an essential for getting
respondents to consent to the interview and to answer questions
willingly. Survey agencies are not likely to hire people for [...] do not
possess at least this minimum degree of “sociality.” Beyond this point,
however, there seems to be little relation [...] other hand, the
respondent was completely “sold,” rapport was excellent, but there was
evidence that the respondent was aware of the [...] her answers. The kind
of situation which the salesman attempts to produce may be precisely the
one which is least suitable for the accurate measurement of opinion. And
the interviewer who is most adept at producing such situations may be as
unsuitable for the interviewing task as the one who encounters too many
refusals.

Other evidence was presented in chapter II to show that the respond- 
ent is often much more detached from the social aspects of the inter- 
view situation and from the personality of the interviewer than he is 
usually considered to be; and that the interviewer himself usually has 
a kind of professional task orientation which enables him to preserve 
“objectivity”; that interviewers themselves regard over-involvement in
the interview socially as a fault to be avoided, and that interviewers as
a group show less “sociality,” as measured by the inclination to discuss
personal problems with others, than the general norm of college-educated
women with whom they may be compared.

Some general conclusions of a tentative nature emerge. Over-all
skills in the various phases of the interviewing task (getting
respondents to answer easily and truthfully, recording answers accurately,
and sampling efficiently) show a fair degree of association. However, each
element of the job requires social skills and other abilities (carefulness,
judgment, intelligence, etc.) in varying proportions, and these
underlying skills, particularly the social and nonsocial, do not appear to be
closely related.

The implications for the survey agency are that the current practice
of rejecting applicants who are markedly lacking in either ability to
approach people or ability to understand and follow instructions, and
fill out questionnaires accurately is a sound one; but also that caution
should be exercised in having interviewers who are excessively socially
oriented. In order to apply these findings efficiently, these skills and
traits need to be measured. Hence we need to know how they are
related to other more easily determined characteristics. If we can find
correlations between skills and independent variables, such as test
scores or interviewer characteristics, we would have some basis for the
selection of good interviewers within the limitations imposed by
interviewer labor market conditions. Before taking up this question,
however, we need to examine the relationship of interviewer performance
in the routine tasks to his biasing tendencies.


Correlations Between Routine Skills and Biasing Behavior


The A.J.C. study described earlier also provides some data on the
relationship between performance in the routine interview tasks (asking
questions, probing, and recording answers) and biasing behavior in
the interview.




Measures of biasing behavior were computed, based on a subjective
evaluation of each error occurring on a Negro, Jewish, or Authoritarian
item to determine whether the error was of a nature to influence
the direction of the respondent’s reply, or to distort his answer in the
process of recording. Any error which seemed to increase spuriously
the respondent’s apparent pro-Negro, pro-Jewish or anti-Authoritarian
attitude received a value of 1 to 3, depending on the estimated
distortion potential. Errors tending to bias toward anti-Negro, anti-Jewish,
or pro-Authoritarian attitudes were scored -1 to -3 similarly. In
addition, comments of the interviewer in his conversation with the
respondent were examined and scored for bias in the same fashion,
depending on direction and distortion potential.

However, in correlating biasing behavior with errors of the various 
kinds, the direction of bias was ignored and the scores on Negro, Jew- 
ish, and Authoritarian items were added together. Correlations of this 
total arithmetic bias with errors are shown in Table 69. 


TABLE 69

CORRELATIONS OF TOTAL ARITHMETIC BIAS IN A.J.C. STUDY
WITH VARIOUS KINDS OF ERRORS


With asking errors.......
With probing errors......
With recording errors....
With cheating errors.....
With total errors........
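
The scoring scheme described before the table can be illustrated with a small sketch. The scores below are hypothetical, and reading "total arithmetic bias" as the sum of absolute values (direction ignored) is this sketch's interpretation of the text rather than a documented formula. The function names are likewise assumptions.

```python
from typing import Dict, List

# Hypothetical signed bias scores for one interviewer, by item group.
# Positive values (1 to 3) mark errors biasing toward pro-Negro, pro-Jewish,
# or anti-Authoritarian replies; negative values (-1 to -3) the reverse.
scores: Dict[str, List[int]] = {
    "Negro": [2, -1],
    "Jewish": [1],
    "Authoritarian": [-3, 1],
}


def arithmetic_bias(scores_by_group: Dict[str, List[int]]) -> int:
    """Total bias with direction ignored (sum of absolute values)."""
    return sum(abs(s) for group in scores_by_group.values() for s in group)


def net_bias(scores_by_group: Dict[str, List[int]]) -> int:
    """Net (algebraic) bias, in which opposite errors cancel."""
    return sum(s for group in scores_by_group.values() for s in group)


print(arithmetic_bias(scores), net_bias(scores))
```

The contrast between the two totals shows why direction was set aside for the correlations: an interviewer may commit a good deal of biasing behavior in total while the opposing errors largely cancel in the net figure.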


Since each kind of error includes biasing as well as neutral errors, the
correlations with biasing errors would not be very meaningful if they
were high. Correlations of biasing errors of each kind with neutral
errors of the same kind would have been more interpretable. However,
the fact that the correlations are so low in spite of the procedure used
indicates virtual independence between biasing and neutral errors.
This result is rather surprising, since we might have expected that those
interviewers most affected by the strain of difficult interviews would
have made more errors of both kinds than interviewers who could
remain more detached from the situation. However, the intercorrelations
in Tables 66 and 69 suggest that the reactional effect of the respondent
on the interviewer is not uniform across all aspects of his work or that
the interviewer does not have a generalized error tendency.

Somewhat different results were obtained by Guest and Nuckols in
their laboratory experiment using three electrically transcribed
interviews concerned with labor-management relations. Answers were
prearranged, one respondent giving predominantly pro-management
answers, another predominantly pro-labor, and a third answers which
were about neutral. The subjects were twenty-four college student
interviewers who had had a small amount of experience in public 
opinion studies. The questionnaires filled out by those interviewers 
from the transcribed interviews were scored for errors in the direction 
of management, errors in the direction of labor, and neutral errors. A 
fairly high correlation, .52, between the number of biased errors and 
the number of neutral errors was obtained, indicating that interviewers 
who made more neutral errors also tended to make more biasing errors. 
The biasing errors, however, tended to cancel each other, as is shown 
by the low correlation of .13 between number of biasing errors and net 
resultant bias. 

In this same study, the correlation between the direction of bias
(pro-management or pro-labor) and interviewers' predispositions in
favor of management or labor as measured by the Leaman labor-relations
scale, was only .19, indicating that the biasing errors were not, for
the most part, attributable to the interviewers’ own predilections. These
results, taken together, suggest to the authors that biased errors, at least
those which arise in the process of recording, are really random
clerical errors. This conclusion is in accord with the theory of interviewer
bias set forth in chapters II and VI, where the interviewer was
described as essentially task-oriented, and error was traced not so much to
the concern of the interviewer with the substantive content of the
response as to the difference in judgment, and in the perceptual frame of
reference of interviewers in coding responses or in selecting what parts
of the answers to open questions should be recorded. In this view, the
main sources of bias are misunderstanding of instructions; mistakes in
judgment of equivocal responses; idiosyncratic definition of his role by
the interviewer himself, proceeding from his own beliefs as to the
nature of attitudes and of respondent behavior; and nonobservance of
prescribed procedures when situational pressures are strong.


Since at least a substantial part of the biased errors occurring in the
interview seem to arise from the same set of causes that produce errors
in general, the selection of interviewers on the basis of skill in the
routine tasks of the interview should also have the effect of minimizing
at least one of the determinants of interviewer bias.

The relation between expectational or stereotypic tendencies and
the skills has not, to our knowledge, been thoroughly investigated,
although some evidence will be presented later on their association with
experience and with validity in general.




Correlations between Skills and Independent Variables 


Menefee lists as some necessary qualifications of good interviewers: 
stability, honesty, and dependability, ability to meet people, intelli- 
gence, interest in the work, objectivity, and experience. Many more 
have been suggested by others.

While these qualifications may have some empirical basis in the 
cumulative experience with field investigations, they cannot have the 
weight of generalizations based on experimental study of the problem 
over a wide range of interviewing conditions. This can be clearly 
demonstrated in the wide variability in the qualifications recommended 
in the literature. Years ago, Cavan tabulated the suggestions of thirty- 
eight different investigators writing in the decade of the twenties on 
the common subject of the good interviewer. The maximum agree- 
ment was on one trait which nineteen of the writers mentioned. In all 
other instances, traits mentioned by any writer were omitted by the 
majority of the other writers. With respect to one trait, "sympathetic 
attitude toward the respondent," there is actually a complete contra- 
diction in the suggestions, with almost equal numbers recommending 
and opposing the presence of the trait in the good interviewer. 
Cavan's tabulation is reproduced in Table 70, below. 

It seems that different past writers may either be sampling different
types of interviewing behavior in establishing the correlates of
performance or may have no objective criteria by which they have
determined the correlates. However, it may be that the different writers are
talking about different kinds of interviews.

Attempts to establish objective criteria of interviewer competence
and the correlates of such competence were made by Guest and by
Guest and Nuckols in the two studies referred to in the foregoing. In
the first of these, fifteen college students interviewed the same
"stooge" respondent. The interviews were transcribed from concealed
wire recorders. Performance of the interviewers based on number of
errors of recording, question wording, omission of questions, failure
to probe, etc., was correlated with their scores on the Bernreuter
Personality Inventory, the Moore-Hill College Aptitude Examination, and
the Strong Vocational Interest Blank. Few of the positive rank-order
correlations were high enough to be of predictive value. Such
personality factors as emotional stability and dominance showed negative
correlations with interviewing skill. Total score on the college aptitude
test showed a positive correlation of only .11. A few fairly high
correlations with some occupations on the Strong Vocational Interest Blank
were found. Guest suggests that these might be used in combination
with each other and with aptitude test scores to develop a multiple
predictor or test battery of high value.


TABLE 70


THE QUALITIES AND ATTITUDES OF A SUCCESSFUL INTERVIEWER SUGGESTED BY
THIRTY-EIGHT DIFFERENT INVESTIGATORS


                                                          No. of Times
                                                          Mentioned

[...] knowledge in the field of investigation...........        5
Broad general knowledge.................................        2
Previous knowledge of the interviewee...................       [?]
The interviewer should be organized emotionally, should
  understand himself [...]..............................        5
Good personal appearance, pleasant manner, well-dressed.        5
[...] ridicule or talk [...]............................       [?]
[...]...................................................       19
[...]...................................................       13
[...] emphasis on misdeeds of [...].....................       [?]
Unemotional, never feel surprise or shock...............       [?]
Responsiveness to interviewee, never bored..............       [?]
Impartial, unprejudiced.................................       [?]
[...] listener, give interviewee complete attention.....       [?]
Other qualities, mentioned by only one or two persons: [...]
drive, perseverance, humor, patience, jollying, cheerfulness,
punctuality, courage, business-likeness, ease in talking


In the more recent laboratory study by Guest and Nuckols,
[...] college student interviewers were first given a battery of
standard tests, including an auditory number-span test and [...]
test, an abridgment of a labor-relations scale developed by [...],
the Minnesota Clerical Test, the Guilford-Martin Personnel
Inventory I, and the Wonderlic Personnel Test, this last being
considered to measure academic aptitude or intelligence. The subjects
were later tested for accuracy in recording three recorded interviews
concerned with labor-management relations, with prearranged
answers developed by the authors. Correlations between total number of
errors and the various test scores are shown below.


The most positive results of the study are the indications that the
more intelligent interviewers are less likely to make errors, as shown
by the negative correlation of .55 between the Wonderlic test and
error score. Since scores on the auditory number-span test showed
a correlation of only .02 with error scores, the authors reason that it is
not memory-span, but some other aspect of intelligence that is
responsible for the better performance of the more intelligent
interviewers. Whatever the reason, there is a strong suggestion to select
intelligent interviewers, even at high cost, “but mere college education
is no guarantee of the intelligence needed.”


TABLE 71


CORRELATIONS BETWEEN TEST SCORES AND TOTAL NUMBER
OF ERRORS IN GUEST-NUCKOLS EXPERIMENT


Minnesota Clerical Test.........................   .08
Wonderlic Personnel Test (intelligence).........  -.55
Guilford-Martin
    Objectivity.................................   .12
    Agreeableness...............................   .32
    Co-operativeness............................  -.06
Auditory Number-Span Test (memory)..............   .02


The only other statistically significant finding is the positive
correlation of .32 between agreeableness and error. The authors suggest that
“agreeable” interviewers may record extreme viewpoints in a less
extreme category or use less forceful words when recording free-response
answers, leading to biasing errors, or that they are just less
careful generally. If we consider “agreeableness” as related to social
interest, this finding is in accord with the apparent negative association
between social skills and other interviewer skills mentioned earlier. The
personality factors of objectivity and co-operativeness show little
relation to errors in recording.

In a recently published study by Herbert Fisher, test scores of
recording ability (determined by reading a dull passage and having the
interviewers record as much of it as they could) were found to be a
good measure of ability to record responses in an interview situation.
The good recorders, those interviewers who made good scores on the
test, recorded a consistently larger proportion of the responses in
subsequent laboratory interviews, with the author acting as respondent.
Furthermore, the poor recorders showed a slightly greater tendency to
select responses which were in accord with their own opinions, but
this difference does not approach statistical significance.

A large-scale analysis of the differential performance of various
types of interviewers, according to their factual characteristics, was
made by Sheatsley in his study of the interviewer labor market. He
examined the quality of the work and stability (length of time on staff)
of 1,161 present and former NORC interviewers. Quality of work was
based on median over-all ratings of each group on a five-point scale
ranging from 1.00 (poor) to 5.00 (excellent). The three components
of the over-all ratings, as mentioned before, were free-answers, clerical
performance, and sampling performance.

The median rating for all 1,161 interviewers was 3.06, but the rating 
for those on the current staff averages much higher (3.62), reflecting 
the process of weeding out of the interviewers with poorer perform- 
ance. Table 72 gives some results of the analysis for a number of the 
factual characteristics. 

Summarily stated, the salient findings are: 

Sex and marital status: Women had better average ratings than men
(3.12 against 2.95), and the married women were superior to single
women (3.15 to 2.91). Furthermore, the married women remain longer
on the staff than the other groups.

Age: The 30-39 age group showed up best on both ratings and length
of service. Below 25 and over 50, the quality of the interviewer's work
was below standard, and the younger age groups also had higher turnover.

Education: College-educated interviewers achieved somewhat higher
than average ratings, though the differences are not statistically
significant and are offset by lower turnover of the less-educated group.

Field of study: The college-educated interviewers who majored in
psychology, sociology, or anthropology received the highest ratings,
followed by those trained in one of the physical sciences. Fine-arts
majors received the lowest ratings of all, while those trained in
business, humanities, or law also received inferior ratings.


Outside employment: Interviewers with full-time jobs in other work
were below average on both ratings and length of service. Interviewers
employed part-time on other work also were below average in ratings,
though not in longevity.

Length of past job experience: Little relation between this factor and
ratings or longevity was noted, except that those with no past job
experience did obtain somewhat lower ratings.

Type of job experience: Surprisingly, interviewers whose past job
experiences involved least contact with the outside public, e.g., office
work, medical technician, etc., averaged highest in the ratings, while
those whose experience had been in jobs involving approach or
persuasion of other people, e.g., salesmen, reporters, social workers,
etc., had the lowest average ratings. In the middle were those whose
jobs involved considerable contact with the public, but little
approach or persuasion (salesgirls, etc.). Sheatsley points to the varied


TABLE 72


PERFORMANCE OF NORC INTERVIEWERS AS RELATED TO PERSONAL CHARACTERISTICS


                                  Average                PERCENTAGE RATED ABOVE
                                  Number of   Median     AVERAGE ON
                                  Months      Over-all   Free      Clerical   Sampling
                                  on Staff    Rating     Answers   Perform-   Perform-
                                                                   ance       ance
All interviewers...............     7.98       3.06        35        33         30
Current staff..................    25.20       3.62        50        48         34
Men............................     5.08       2.95        32        34        [?]
Women..........................     8.32       3.12        35        32        [?]
Single women...................     6.23       2.91        35        31         22
Married women..................     9.71       3.15        35        32         33
[?]............................     4.79       2.68        27        24         20
[?]............................     4.65       2.98        38        39        [?]
[?]............................     7.38       3.13        39        38        [?]
[?]............................     9.40       3.20        37        32        [?]
[?]............................    11.42       3.04        35       [?]         25
[?]............................     7.70       2.91        28        21         26
Some graduate work.............     7.28       3.20        39        35         35
Completed college..............     7.48       3.17        40        30        [?]
Some college...................     8.44       2.99        35        35        [?]
[?]............................    10.06       3.00        28        29        [?]
[?]............................     6.40       3.33        48        36        [?]
[?]............................     7.12       3.09        40        27         28
[?]............................     7.62       2.99        45        28        [?]
[?]............................     6.70       3.22        38        36        [?]
[?]............................     7.03       2.99        35        29        [?]
Employed full-time.............     [?]        [?]         34        30        [?]
Employed part-time.............     9.12       2.99        40        31        [?]
No other employment............     8.75       3.12        35       [?]        [?]
Past job experience:
  [?]..........................     8.70       3.00        29        34        [?]
  Less than 2 yrs..............     6.82       3.11        42        34        [?]
  2-5 yrs......................     8.35       3.12        38        37        [?]
  Over 5-10 yrs................     5.77       3.06        35        28        [?]
  Over 10 yrs..................     9.04       3.07        33        27        [?]
Experience with job:
  [?]..........................     8.45       3.16        38        30         35
  Involving approach, persuasion    7.66       2.96        38        30        [?]
  Involving public contact but
    little approach, persuasion     8.95       3.05        31        28        [?]
  Involving no public contact..     8.10       3.17        36        38        [?]
Type of past interviewing
  experience:
  Student, academic surveys....     7.60       3.38        44        44        [?]
  Other opinion research.......    10.10       3.17        36        43        [?]
  Consumer, market research....    10.16       3.09        34        36         28
  Informal unscientific surveys     7.55       3.00        35        27         27
  No past experience...........     7.64       3.05        36        30         35
Supervision:
  Independent interviewer......     8.64       3.05        35        31         31
  Assistant to supervisor......     7.00       3.17        32        40         26

nature of the interviewer's job as the explanation: "The group
experienced in approach and persuasion, for example, averaged well on
'free-answer' performance, but fell down slightly on the clerical and
sampling aspects of the work, while those with only clerical or allied
experience carried out the last two aspects of their work in a superior
manner."

Type of past (pre-NORC) interviewing experience: Interviewers
experienced in student or academic surveys at college achieved the
highest ratings of any experience group (3.38), but those experienced
in other opinion survey organizations also earned better-than-average
ratings and have lower turnover.

Supervision: Interviewers whose work is directly supervised (mostly
those in the large cities) obtained higher ratings than those receiving
their assignments from the central office. This is largely attributable to
[...] clerical work, an expected finding, since the clerical aspects of
[...] are most easily verified by the supervisor.

Some of these findings will not be unexpected to those in the field of
public opinion or market research, but they are useful in providing an
objective confirmation of long-held opinions and impressions. Others
furnish new evidence on hitherto disputed questions, such as the
evidence that those with prior interviewing work [...] rather that
experience with other agencies appears to be an advantage [...] have
held. Still others completely [...] Yet the differences found are [...]
characteristics has in itself [...] since no group shows an average
rating of better than 3.38. In this sense, the study may be
considered disappointing, but Sheatsley reminds us that
interviewing is a complex of many different skills and cites the
Guest studies, already mentioned, to show that other investigators have
had difficulty in finding factors related to even one isolated aspect of
interviewing skill, such as recording ability. Moreover, if [...]
factual characteristics are combined, the chances of successful
prediction are increased. He states: "We find, for example, that women
aged 30 to 50, with some past opinion or market research [...]
experience, achieve average ratings of 3.29 and remain an average of
14.9 months on the NORC staff. These are a great deal better than
averages for all interviewers, and a staff hired merely on [...]
would be expected to perform with above-average skill, all other
factors being equal." Sheatsley concludes with the suggestion that
co-operative research in the development of new and more appropriate
tests offers the best prospect of success and emphasizes that such tests
must measure not only skills, but also job motivations, attitudes toward
research, etc., if they are to predict total performance.

Another extensive experimental attempt to find the correlates of
good interviewer performance is reported by Keyes. The
forty-five interviewers employed on a community survey of Denver
by the Opinion Research Center described in Chapter VI were the
subjects of the experiment. The interviewers were given seven
psychological tests, and their test scores together with interviewer factual
data were compared with survey performance as judged from the
number of "DK" responses, ratings of adequacy of respondents'
answers, evidence of probing, and completion of assignments.

The major findings are summarized below:

Factual characteristics:

1. Education: College graduates showed higher competence than
the interviewers with some or no college. Those who had
training in public opinion theory showed outstanding performance.

2. Experience: Interviewers who had worked on twenty-five or
more surveys achieved better scores than those with less or no
experience.

3. Sex: Women obtained higher competence scores than men.

4. Age: The 35-44 age group were most competent.

Personality: A tendency to introversion and low social adjustment
was associated with superior performance.

Interests: Aesthetic and theoretical value orientations were associated
with better performances, while interviewers whose values were
chiefly economic, political, or religious were inferior. In terms of
occupational interests, those interested in literary pursuits did best,
while interest in "persuasive" occupations was associated with lower
competence.

Intelligence: Somewhat superior performance was shown by those 
who obtained high scores on the California test of mental maturity. 

Clerical ability and recording ability: Clerical ability, as measured by 
the Minnesota Clerical Test, and recording ability, as determined from 
tests constructed especially for this study, were both somewhat re- 
lated to superior performance. 

The study was not successful in finding psychological tests of high
predictive value. Correlation coefficients of the test scores with
performance criteria were all too low to insure confidence in predictions
made from the scores. Furthermore, some of the relationships
previously cited may be spurious or confounded, since the partial
associations or correlations between the test variables and factual
characteristics and the performance scores are not available.
Nevertheless, the general profile which emerges of the better
interviewer as female, 35-44 years old, possessing superior education,
experience, and intelligence, with introversion tendencies, is in general
agreement with the findings of other investigators already cited. It will
be remembered that the Guest studies showed a high positive
association between intelligence and interviewing performance, with a
suggestion of a negative correlation between social orientation and
performance, and that the Sheatsley labor market study found that
women, those in the 30-40 age group, those with superior education,
and those whose background was in the non-persuasive occupations
obtained better interviewer ratings.

A study by Taft gives support to this somewhat paradoxical finding
of a relation between social tendencies and poor performance and
provides insight into the dynamics involved. Taft studied the
correlates of ability to judge or rate both the traits of other individuals
and the proportion of a group which would collectively show certain
traits. The correlates were determined for a group of forty male
graduate students on the basis of an elaborate three-day [...]
program. Such specific findings as the following may be observed:
The physical science majors were superior to social science students.
There was a moderate positive correlation between accuracy of
judgment and "carefulness." The good judges were [...] and serious.
The poor judges [...] and imaginative. The good judges were
task-oriented rather than person-oriented. They possessed
an "organized, socially passive, serious, unemotional and realistic
personality." Taft concludes that: ". . . the good judges of others are
extraceptive persons possessing a hard headed judging attitude . . .
while poor judges are intraceptive persons who view other people in
terms of their relationship with themselves; they are socially dependent
and err in the direction of being over-generous."

While these findings bear specifically on only that component of the 
interviewer's task involving judgment or rating of traits, they seem 
germane to the larger findings reported earlier, and they suggest that 
objectivity in other realms of performance may also be jeopardized by 
excessive sociality. 

Additional confirmation of this general finding is available from an
exploratory study done under widely different conditions. A group of
ten interviewers listened to a transcription of an interview, took notes
of the contents, and later wrote a report of the interview. Their reports
were rated by two independent judges on clarity of expression,
organization of the material, completeness of recording of details, and
freedom from distortion. The interviewers were also rated on their
tendency to be "person-oriented" or "content-oriented" (analogous to our
concepts of social vs. task involvement), as determined by judges'
ratings of the comments and evaluations the interviewers were asked to
make on the technique used in the transcribed interview. Correlations
of the skills with type of orientation revealed a negative association
between person-orientation and skill. What is again suggested by the
data is that too great a social orientation in some manner interferes
with the performance of the more routine duties of interviewing.


Relation of Experience to Interviewer Effects 


There is considerable disagreement in the survey field concerning
the effect of experience on interviewer performance. Many research
workers claim that the improvement in skills and understanding which
comes with experience is offset by greater knowledge of short cuts and
cheating practices and development of idiosyncrasies of interviewing.
There is a general tendency to hire inexperienced interviewers who can
be more easily trained in the research agency's particular techniques
and procedures.

The factual evidence available does not settle all the issues in this
controversy, especially since current measurements of performance
rely largely on the evidence which appears on completed
questionnaires and do not demonstrate what actually goes on in the interview.
Nevertheless, studies relating experience to various aspects of
interviewer performance deserve some attention in any consideration of
desirable interviewer characteristics.

The most comprehensive examination of the relation between
experience and performance is again found in Sheatsley's study of the
interviewer labor market. Table 73, reproduced below, shows how
NORC interviewers' ratings changed with the length of time on the
staff. A simple comparison of ratings of interviewers with various
lengths of experience would not answer the question, because selective
firing and resignation tend to weed out the poorer interviewers in
time. Therefore the table compares the ratings over time of the same
interviewers.
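
The logic of this comparison can be illustrated with a minimal sketch in Python. The ratings and lengths of service below are hypothetical figures invented for the illustration and are not Sheatsley's data; the function names are likewise our own. No interviewer's rating changes from year to year in this example, yet the simple cross-sectional average rises because the poorer interviewers leave first, while the average for a fixed group of the same interviewers stays flat.

```python
from statistics import mean

# Hypothetical yearly ratings, one list per interviewer; the length of
# each list is the number of years that interviewer stayed on the staff.
# Ratings are held constant, so nobody improves with experience here.
STAFF = [
    [2.6],
    [2.8],
    [3.0, 3.0],
    [3.2, 3.2],
    [3.4, 3.4, 3.4],
    [3.6, 3.6, 3.6],
]


def cross_section_mean(staff, year):
    """Mean rating in `year` of everyone still on the staff that year."""
    return mean(r[year - 1] for r in staff if len(r) >= year)


def same_group_mean(staff, min_years, year):
    """Mean rating in `year` of the fixed group lasting at least `min_years`."""
    if year > min_years:
        raise ValueError("year must not exceed min_years")
    return mean(r[year - 1] for r in staff if len(r) >= min_years)


if __name__ == "__main__":
    for year in (1, 2, 3):
        print(year,
              round(cross_section_mean(STAFF, year), 2),
              round(same_group_mean(STAFF, 3, year), 2))
```

With these invented figures, the cross-sectional means come to 3.1, 3.3, and 3.5 for the three years, while the mean for the fixed three-year group remains 3.5 throughout. Only a comparison of the same interviewers over time separates improvement from selective turnover.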

We see from the table that (1) the ratings for each group showed
consistent improvement over time, with the single exception of the
fifth and later years, when there is a slight drop; (2) interviewers who
remained longest on the staff turned in the highest first-year ratings,
and the longer-lived interviewers received consistently higher ratings at
equivalent points.


TABLE 73

MEDIAN ANNUAL RATINGS OF NORC INTERVIEWERS

                                          First   Second  Third   Fourth  Succeeding
                                    N     Year    Year    Year    Year    Years

All interviewers:
  Rating for first year             932   3.04
Interviewers who lasted more
than one year:
  Rating in first two years         369   3.29    3.32
Interviewers who lasted more
than two years:
  Rating in first three years       192   3.33    3.53    3.65
Interviewers who lasted more
than three years:
  Rating in first four years        115   3.38    3.53    3.73    3.82
Interviewers who lasted more
than four years:
  Rating in each year               [?]   [?]     [?]     3.82    4.06    3.88


As Sheatsley says, the findings "cast grave doubt on the hypothesis
that interviewers do their best work early in their [...] and [...]
to lose interest or to grow careless. On the contrary, there is, for the
most part, a steady though not sensational improvement from year to
year." This seems true enough for the interviewers who remain on
the staff a long time, but it may be accounted for by loss from firing and
resignation of interviewers who do not improve or whose performance
deteriorates. In other words, those who remain on the staff are much
more likely to be those interviewers who for one reason or another
sustain their interest so that they are able to profit from experience. It
is clear from the table that they were the better interviewers from the
beginning. Sheatsley gives the median second-year rating of 3.11 for
those who lasted only two years compared with a median of 3.53 for
those who lasted more than two years, with the median for the whole
group of 3.32. Examining the median first-year ratings, it seems clear
that this must have been higher than 3.11 for those who were to last
only two years. Apparently those interviewers who last only two years
do not improve in their second year, but actually receive poorer ratings.

From Table 74 below, it appears that an interviewer's work in
his first year on the staff is a pretty reliable predictor of how he will do

TABLE 74

RELATIVE PERFORMANCE BY GROUPS
(NORC Interviewers)

                                FIRST-YEAR RATING
                        Below                   Above
SECOND-YEAR RATING      Average     Average     Average

Below average           [?]         34          17
Average                 [?]         26          21
Above average           [?]         40          62
                        ---         ---         ---
                        [?]         100         100

                        N = 137     N = 100     N = 132


TABLE 75

LENGTH OF TIME ON STAFF BY FIRST-YEAR RATING
(NORC Interviewers)

                              LENGTH OF TIME ON STAFF
                              Percentage   Percentage   Percentage
                              Less Than    One to       Over Two
First-Year Grade      N       One Year     Two Years    Years

[illegible]           33      82           12            6
[illegible]           104     63           18           19
[illegible]           100     43           27           30
[illegible]           84      53           20           27
[illegible]           [?]     [?]          [?]          23
in his second year. This is perhaps the most important finding. As
Sheatsley says, "It now appears that if an interviewer is not turning in
satisfactory work at the end of the first year, the money spent on
educational correspondence or personal re-training had better be spent
on the hiring of someone else."

Fortunately, most of the poorer interviewers do not remain long on
the staff: 82 per cent of those with poor ratings in their first year
remain less than one year and only 6 per cent of them stay more than
two years. On the other hand, interviewers receiving the very best
ratings at the start do not remain as long as those with "average"
ratings, probably because of the competition of better-paying jobs.

We have been discussing the relationship of performance ratings to
experience with NORC. In terms of prior experience with other
agencies, the picture is somewhat different. We cited earlier the slightly
superior performance of NORC interviewers with some previous
experience in interviewing with other agencies. However, those with
very long prior experience (over five years) show much
poorer-than-average ratings; the differences shown in Table 76 between the
distribution of interviewers with over five years prior experience over the
groups below average, average, and above average, and the
corresponding [...] significant at the 5 per cent [...] interviewers with a
[...] find it difficult to [...]


TABLE 76

AVERAGE RATING OF NORC INTERVIEWERS BY PRIOR INTERVIEWING EXPERIENCE

                                               PERCENTAGE RATED
                                               Below                Above
                                         N     Average   Average    Average

No past interviewing experience          430   48        17         35
Up to 6 mos. past experience             39    43        18         39
6 mos.-2 yrs. past experience            103   42        20         38
Over 2 yrs.-5 yrs. past experience       70    41        22         37
Over 5 yrs. past experience              45    54        27         19

Evidence of superiority of experienced interviewers in obtaining
multiple answers on open-ended questions is available from
unpublished data from the NORC Denver Community Survey described
elsewhere in this report. In this study, nine interviewers were assigned
to each of five sectors, with assignments in each sector randomized. On
all four open-ended questions shown in Table 77 below, a higher
percentage of the experienced interviewers (those who had worked
previously on seven or more surveys) were among the top three in
their sector in number of answers obtained.


TABLE 77

THE RELATION OF EXPERIENCE TO ABILITY TO OBTAIN MULTIPLE ANSWERS ON
OPEN-ENDED QUESTIONS

                        PERCENTAGE FALLING IN TOP
                        THREE IN SECTOR
                        Experienced    Inexperienced
Question                N = 19         N = 26

[illegible]             42             27
[illegible]             42             27
[illegible]             47             23
[illegible]             42             27


It seems that these data can be interpreted in terms of greater
probing skill for the more experienced interviewers. Evidence tending in
the same direction, although not statistically significant, is available in
the results of the experimental measurement of interview probing skills
in a laboratory situation described in Chapter VI. Of the sixty-one
interviewers who participated in the experiment, thirteen might be
called inexperienced, arbitrarily defined as those who had worked on
fewer than nine surveys for NORC. The "probing tendency" score,
measuring tendency to probe answers which should be probed,
averaged 93 for these thirteen against a score of 103 for the remaining
interviewers.


In the Denver study, it was possible to determine the validity of
respondent answers on a number of characteristics from outside
records. Table 78 shows that, on two of the three items validated, the
experienced interviewers obtained results of greater validity, while on the
third item, the difference is negligible.

When the Chi-squared values for the three items are pooled, the
results are significant at the .02 level.

How experience develops in interviewers an ability to get valid
answers is not revealed by the study. It should be noted that
inexperienced interviewers in this study, though lacking field experience, had
taken courses in interviewing and other phases of survey method.

Suggestive evidence was reported in Chapter V that inexperienced
interviewers are more likely to introduce interviewer effect in the
classification of responses into pre-coded boxes because they would
have greater need of the aids furnished by unconscious biasing
tendencies in simplifying the task of classifying answers.

In their experiment on attitude-structure expectations described in
Chapter III, Smith and Hyman tested the hypothesis that inexperienced
interviewers would be more prone than the experienced to allow their
expectations based on the whole attitude structure of the respondent to
influence their coding of respondents' answers, owing to insufficient

TABLE 78

THE RELATION OF INTERVIEWER EXPERIENCE TO INVALIDITY OF RESULTS

                                                AMONG INTERVIEWERS WHO ARE
                                                Experienced    Inexperienced
Percentage Who Fall into Groups Shown           N = 19         N = 26

Ownership of driver's license
  In the upper three in amount of invalidity    [?]            [?]
  In the middle three                           [?]            [?]
  In the lower three                            [?]            [?]

Personal contribution to Community Chest
  In the upper three                            [?]            [?]
  In the middle three                           [?]            [?]
  In the lower three                            [?]            [?]
                                                100            100

Voting in 1948 presidential election
  In the upper three                            [?]            [?]
  In the middle three                           [?]            [?]
  In the lower three                            [?]            [?]

training or lack of conscientiousness. In this case, a phonograph
transcription of an interview with a respondent whose attitudes were
predominantly isolationist was used. In Table 79 below, we see that,
on both the questions tested, the inexperienced subjects had more
incorrect codes and seem more likely to code in terms of expectation
effects, but the differences are not statistically significant so that no
definite conclusions can be drawn.

In Chapter VI we cited the finding that inexperienced student
interviewers were [...] than the experienced interviewers of two [...]
polling organizations. Their lesser ability to overcome respondent
resistance resulted in more refusals and a higher proportion of "Don't
knows" on completed interviews, although it was not clearly
demonstrated that the less experienced interviewers recorded opinions,
preferences, or facts significantly different from those of the experienced
interviewers. However, the fact that the inexperienced interviewers


TABLE 79

THE RELATION OF EXPERIENCE TO EXPECTATION EFFECTS AS SHOWN BY CODING OF THE
ISOLATIONIST RESPONDENT'S REPLIES*

                                                      PERCENTAGE AMONG INTERVIEWERS WITH
                                                      No Experience    One Year or More
                                                      N = 33           N = 36

Interest in Spain
  Take no interest in policy toward Spain (incorrect) 29               16
  Some interest (correct), other codes                71               84
                                                      ---              ---
                                                      100              100

* Chi-square tests yield P-values of .28 and .16, and a combined test based
on the aggregated Chi-square yields [...]


had higher non-response rates is significant, because this difference
might lead to differential biases in other cases where the characteristic
being measured were more closely related to differential tendencies to
respond.

All the studies just mentioned (and the Keyes study cited earlier)
have shown some tendency for the experienced interviewers to be
superior, either in one or another aspect of interviewer competence or
in the avoidance of bias. It is true that an earlier study by Cantril,
discussed in Chapter V, found no relation between experience and bias,
but the inexperienced interviewers actually had participated on an
average in ten surveys and therefore may not furnish a real test of the
effects of inexperience.

In summary, the weight of the evidence supports the conclusion that
we may expect superior performance from the more experienced
interviewer. Two qualifications should be made, however:

1. Any apparent superiority of experienced interviewers may be due
as much to selective turnover (the better interviewer generally
remains longer on the staff) as to the beneficial effects of experience
itself. Whatever the reason, the length of experience still seems valid as a
predictor of performance.

2. It seems that the research agency should be cautious about hiring
interviewers with particularly long experience with another agency,
but this should obviously depend on the degree of similarity of the
work of the two agencies.


Correlation of Bias and Independent Variables


Very little information on the relationship between biasing
tendencies and other interviewer characteristics is available. We have
already cited some suggestive evidence that experienced interviewers
may be less likely to bias results. In the Guest-Nuckols study already
described, the number of biased errors of recording in an artificial
interview situation was compared with psychological test scores for
twenty-four college students. Errors were scored by the judges as in a
pro-management direction, pro-labor direction, or neutral. The excess
of errors in one direction over errors in the other direction was divided
by the total number of biasing errors to obtain a resultant bias index
(net bias). The correlations between these bias measures and test
characteristics are shown in Table 80 below:


TABLE 80

RELATION OF BIAS TO VARIOUS INTERVIEWER CHARACTERISTICS

                                               CORRELATION WITH
                                               Total Number     Resultant
TEST CHARACTERISTIC                            Biased Errors    Bias Index

Clerical ability (Minnesota Clerical Test)      .04             -.08
[...] Personnel Inventory:
  Agreeableness                                 .35              .24
  Co-operativeness                              [?]              [?]
Intelligence (Wonderlic Test)                   [?]              [?]


These results are not conclusive; the correlations of less than .35 are
not significant at the 5 per cent level. In their general direction,
however, they are corroborative of the persistent tendency we have noted
for superior performance to be positively associated with superior
intelligence, as shown by the negative correlations of intelligence with
both total number of biased errors and net bias, and for characteristics
which seem associated with social skills or social orientation,
agreeableness, or co-operativeness to be somewhat negatively associated with
performance, although this relationship is not a strong one.
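
The resultant bias index described before Table 80 (the excess of errors in one direction over errors in the other, divided by the total number of biasing errors) can be written out as a brief sketch in Python. The function name, the sign convention (positive for a pro-management tilt), the exclusion of neutral errors from the denominator, and the error counts in the usage lines are all assumptions made for illustration; the source does not report the individual counts.

```python
def net_bias_index(pro_management: int, pro_labor: int) -> float:
    """Excess of errors in one direction over the other, divided by
    the total number of biasing (directional) errors.

    Assumptions: positive values mean a pro-management tilt, and
    neutral errors are left out of the denominator.
    """
    total = pro_management + pro_labor
    if total == 0:
        return 0.0  # no directional errors, so no measurable net bias
    return (pro_management - pro_labor) / total


# Hypothetical counts, not taken from the Guest-Nuckols data.
print(net_bias_index(6, 2))  # 0.5
print(net_bias_index(3, 3))  # 0.0
```

On this definition an interviewer may make many biased errors and still show a net bias near zero if the errors cancel, which is why Table 80 reports the total number of biased errors and the resultant index separately.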

Evidence on what variables might be used as predictors of tendencies
to ideological or expectation biases is almost nonexistent. It might be
expected that ideological bias would be most likely to be introduced by
interviewers whose viewpoints are nearer the extremes. In the
Guest-Nuckols study, interviewers were tested by the Leaman labor-relations
scale, which had been shown to differentiate between persons who,
because of their background, might be expected to be pro-management or
pro-labor. However, the low correlation of .19 between scores on this
scale and the direction of the net bias revealed little tendency for
interviewers to record respondents' answers to accord with their own point
of view.

The quantitative material presented in Chapter II, particularly in the
phenomenological interviews, seemed to show that interviewers differ
widely in their proneness to expectation effect. Some interviewers do
not accept the notion of a consistency or unity of attitudes, and
apparently this is particularly true of the interviewer who shows little
"intrusiveness" or social orientation to the respondent, a fact which
may prevent him from synthesizing his impressions. On the other hand,
about a third of the interviewers said they could size up the respondent
and predict his answers in advance half the time or better, an indication
of role-expectation tendencies, and many interviewers reported using
"contextual aids of a stereotyped sort" in classifying ambiguous
answers.

When interviewers were classified as stereotypic or non-stereotypic
on the basis of the F-scale derived from the Berkeley study of
authoritarianism, found to be correlated with stereotypicality, a larger
proportion of the "stereotypic" interviewers reported in a subsequent
questionnaire that they could predict respondent answers half the time
or better (44 per cent against 30 per cent for the "non-stereotypic"
interviewers). From psychological studies of stereotypicality, tests
might be developed which would be more efficient diagnostic
indicators of tendencies to expectation biases.

The sources we have cited thus far all attempt to relate interviewer
performance to classical traits or characteristics. The individual
correlations found are too low to be very useful for selection purposes,
although a test combining a number of characteristics might be found
which would have good predictive value. The relative weakness of
individual psychological tests for predicting performance is not unique
to interviewing. Ghiselli found the same thing to be true of tests for
predicting workers' performance in many other occupations, after
examining some 120 published references on the subject. Furthermore,
he points out that tests which may be useful for one organization may
not suit the requirements of another.

Possibly a more fruitful approach would be found in the use of tests 
which do not attempt to find the correlates of interviewing skill as such 
but rather attempt to measure performance in a situation which simu- 
lates that of the interview itself. This quasi-interview situation may be 
so designed that some of the more important components of interview- 
ing ability and skill may be measured. A number of tests of this kind 
have been described in this report, though they were undertaken as ex- 
periments in interviewer effect rather than for the selection of inter- 
viewers. Comprehensive tests designed to measure freedom from bias, 
recording ability, and even probing skill and rapport in simulated inter- 
view situations would probably be very expensive and certainly would 
not always be practical as a regular procedure in personnel selection,
but under some conditions, they might be used profitably, perhaps
supplemented by batteries of psychological tests.

There is some suggestive evidence that such performance tests
involving a quasi-interview situation may be superior instruments. A
number of organizations now make use of a "test narrative," in which a
fictitious interview is described in detail, with each question by the
fictitious interviewer and each answer by the respondent written out.
On the basis of these answers, the interviewers or prospective
interviewers taking the test fill out the schedule or questionnaire. This
procedure gives an opportunity to introduce knotty problems which will
test at least the ability of the interviewer to understand and follow
complicated instructions and his accuracy in recording respondents'
answers. The Census Bureau makes effective use of such test narratives.
In 1948, as a part of the pretest of the forthcoming census, the Bureau
made a quality recheck of schedules in a few counties, using personnel
from the central office to carry out the re-interviews. Comparison of
test narrative scores with measures of field work accuracy of the
original interviews, as determined by agreement with the quality check
re-interview, suggests that the test narrative may be useful as a
predictor of performance in the field, although, statistically, the sample is
too small and the differences too unreliable to constitute definite
proof. Researchers working for the British Social Survey report that
they have found the "test narrative" approach useful in the selection of
interviewers. For the purpose of devising an upgrading scheme for
interviewers of proven competence, the Social Survey used two tests:
one, a simple clerical test, the other, a series of dummy interviews with
prepared answers (i.e., interviews in which the informant supplies
identical information to each candidate). By this means, the researchers
report they found important differences between interviewers in
clerical ability and accuracy of recording, although admittedly they could
not measure by this means alone all the factors, many of them
intangible, which go to make up the good interviewer.

The Smith-Hyman study of expectation bias, described in Chapter
III, provided an instance in which performance in a quasi-interview
situation could be compared with quality of work in an actual field
survey. Proneness to expectation effects, as determined from a
laboratory experiment, was found to be somewhat associated with greater
invalidity of results in the Denver Community Survey, in which
independent checks for some questions were available in official records.


Minimizing Bias Through Training Procedures


Research agencies depend largely on careful instruction and training
of interviewers in correct interviewing procedures for the avoidance of
bias. These training procedures have been developed naturally from
experience and from the experimental studies of interviewer bias which
have appeared in the literature, and the emphasis in training manuals
reflects the prevalent beliefs as to the sources and locus of bias.
Examination of a number of the training manuals currently in use by market
and opinion survey agencies discloses that the principal source of bias is
conceived to be ideological and that the locus of bias is considered to
be chiefly in the process of asking questions. By contrast, biases arising
in the process of recording respondents' answers have received less
attention, and the operation of perceptual and cognitive factors such as
expectations has been almost completely neglected. We may hope that
one result of this study of interviewer effect will be to shift some of the
emphasis in training to those sources and loci of error which this study
has shown to be of hitherto unsuspected importance.

Every one of the interviewing manuals examined has included
admonitions to the interviewer to ask questions using the exact wording
of the questionnaire and in the exact sequence in which the questions
appear on the questionnaire, and every one of them has cautioned the
interviewer to avoid influencing the answer of the respondent either by
actual suggestion of answers or by conscious or unconscious verbal
emphasis or mannerisms, and to refrain from expressing his own
opinions, even when asked to do so by the respondent. But with the
exception of the NORC manual, most of them have scant material on
the biases which may arise in the recording process. None of them that
we have seen makes any mention of possible biases arising from
interviewer expectations, including the NORC interview manual, which is
the most voluminous and has twenty-five separate references to biasing
factors, including even a warning concerning biases arising from
differences in race, economic class, or sex between interviewer and
respondent.

Curiously enough, one manual contains an admonition which would 
seem to encourage the introduction of bias through the employment of 
attitude-structure expectations. We quote: “Should the respondent 
change his opinion during the course of an interview, you must check 
over the questionnaire from the beginning and make sure all answers
are consistent.” And again: “Make sure all answers are properly co- 
ordinated and provide a complete story.” 

This insistence on consistency seems to require that the interviewer
reject any answer not in accord with his expectations based on the
attitudes revealed by answers to the earlier questions!

It should be stated that agencies have made continuous efforts to
eliminate or reduce bias in interviewing by intensive instruction and
training, by means of manuals, specifications for particular surveys,
and by continuing supervision and inspection of the interviewer's work.
Every effort is made to enforce uniform practices in interviewing so
that the results will at least be comparable. The degree of supervision
exercised varies, depending in part on the size of staff of the
particular agency. Some of the agencies have regional supervisors who
are in at least occasional contact with the interviewers. NORC training
and supervision procedures are described at length in an appendix to
this report. Each interviewer's work is rated regularly, and upon the
completion of each survey he receives a personal letter in which errors
of procedure, in so far as they can be determined from an examination
of the completed schedules, are pointed out. Marked or unusual patterns
in the responses, the repetition of particular words or phrases in
free-answer replies, indications that suggestive probes have been used,
deviant behavior as revealed by comments on the interviewer's report
form, and the like faults are noted and called to the attention of the
interviewers. Similar procedures are used by other agencies. This
intensive training is designed not only to reduce error but to produce
homogeneity, which . . . as we shall have occasion to . . .

When the interviewer is first hired, he receives individual training in
NORC techniques and procedures under the personal direction of an
office or regional supervisor. This training includes study of the
manual, basic instructions, and trial interviews, which are observed and
criticized by the supervisor. During the course of this training, the
supervisor will point out weaknesses and biasing tendencies in the
interviewer's work. Applicants with obviously biasing personal
characteristics are never hired, and the new interviewer is
indoctrinated early in his training with such precepts as “Never suggest
an answer,” “Ask all questions exactly as worded,” “Never show surprise
at a person's answer,” “Never reveal your own opinions,” etc. The
interviewer manual devotes particular and detailed attention to the
subjects of field ratings and probing behavior, two of the areas in
which studies have found greatest evidence of bias. The specifications
for each survey point out the areas in which bias is most likely to
occur on the survey.


Improvement in Personnel Policies and Working Conditions 


To one familiar with the status of present-day interviewing and the
conditions under which interviewers work, there must appear to be a
certain futility in elaborate research to find methods of selecting the
best interviewers, without at the same time finding ways to make
interviewing work sufficiently attractive to appeal to such
hypothetically superior personnel. Lists of the qualifications required
for good interviewers have been made to sound like a catalog of all the
virtues: a high degree of intelligence, pleasing personality,
carefulness, dependability, honesty, good physical condition, good
education, and many others. But what does the research agency offer for
this paragon? Work which is physically and mentally demanding, low pay,
sporadic assignments given with little advance notice, and no
opportunity for advancement. Present average pay rates for interviewing
work run as low as $1.00 per hour, compared with the average rates of
70-75 cents common ten years ago. Although we sometimes see interviewing
characterized as "professional" work, such pay rates could hardly be
expected to attract persons with professional qualifications, certainly
not for full-time work.

But interviewing, as market and opinion research is currently
organized, is not full-time work. The frequency and size of assignments
varies somewhat from one agency to another, but the range is probably
from about eight to twenty assignments per year, of a few hours to
four or five days in length. Hence, most of the agencies rely on
housewives and others who do not have to work full-time for a living,
who may be able to use a little pin-money, or who accept the work
because it relieves the tedium of household duties. For the
compensation received, it seems that they produce a high caliber of
work! Thirty-eight per cent of NORC interviewers, in reply to a mail
questionnaire, thought they would continue to do NORC interviewing even
if paid only 75¢ an hour, and only 29 per cent thought they would be
better interviewers if paid $1.50 an hour. However, it may very well be
true that if interviewers were employed on a full-time basis and given
more of a professional status and higher rates of pay, improvement in
results would be obtained. Opinion survey agencies in particular,
because of the presumed effect of their findings in the determination
of public policy, have a responsibility to increase the reliability of
these findings. And a mere statement of the undoubted difficulties in
the way of employment of full-time interviewers at higher rates of pay
does not discharge this responsibility. If current limitations imposed
by financial and operating conditions are accepted as fixed and
unalterable, it is doubtful if any thoroughgoing improvement in
interviewing standards can be achieved.

Improvement in the conditions of interviewing work might not only 
attract a superior type of interviewer but might also bring about a re- 
duction in turnover of the better interviewers. As matters now stand, 
many of the better interviewers leave after a short period to take
better-paying jobs. Of all NORC interviewers hired over a period of
years, only one in five remained as long as two years or completed as
many as twenty assignments. The NORC experience is fairly typical of
most research organizations. In contrast, of interviewers hired by the
Bureau of Agricultural Economics during four war years, almost half
remained two years or more. BAE interviewers were employed full-time,
had professional status, and received considerably higher-than-average
pay. This comparison implies that interviewer turnover would be greatly
reduced if the job could be made to offer greater security, more
regularity, higher pay, and higher status.

On the other hand, as long as interviewing remains an occasional or
part-time job at low pay, turnover in the staff will be minimized by
hiring persons who are not in the full-time labor market and who will
therefore not be attracted by other jobs. Under present conditions, the
frequency and size of assignments and the type of work determine almost
completely the type of interviewer hired. The cities and counties in
which the services of interviewers are required are specified by the
sampling requirements, and hence the field department is restricted in
its ability to act on independent applications, or to increase the
frequency of assignments. If interviewing were to be made a full-time
job, research agencies would probably not only have to pool their
interviewing staffs (a practice already followed to some extent) but
might also be forced to use the same national samples of primary areas.
Higher rates of pay for interviewing would mean drastic changes in the
economics of the industry. It is unlikely that such changes will come
about without great pressure from outside.


2. CONTROL OF ERRORS ARISING FROM RESPONDENT REACTIONS 


In Chapter IV, it was pointed out that certain respondent reactions
arise from the interpersonal nature of the interview situation
independently of the particular interviewer. Reduction of the error from
this source can be effected therefore only through modification of the
interview situation, as discussed in the next section.

Bias arising from the group membership disparities between
interviewers and respondents has long been recognized by research
agencies which have modified certain practices to control error. As
Sheatsley remarks:


It has become more and more unlikely that any research agency,
except for experimental purposes, would use white interviewers to survey
the opinions of a cross-section of Negroes, would hire “Jewish-looking”
interviewers to conduct a poll on the subject of anti-Semitism or would
employ a crew of upper class clubwomen to carry out a survey on the
attitudes of the slum dwellers.


But aside from such precautions in special cases where it is clear that
the group membership disparity could seriously affect the results, such
disparities continue to exist as a potential source of bias. In his
study of the composition of existing field staffs, Sheatsley shows that
interviewers are of a considerably higher education and socio-economic
status than the general population. The “typical” interviewer, he
reports, is “an upper-middle class woman, about 40 years old, with at
least one or two years of college.”

The Katz study, referred to in Chapter IV, provided evidence that
the use of middle-class interviewers to interview the general
population tends to distort results in the direction of conservatism.
Selection of respondents under quota sampling, as has been shown
repeatedly, tends to produce an under-representation of lower-income and
lower-education groups, and such an under-representation also distorts
results in the conservative direction. This compounded bias against
lower-class opinion is probably the largest and most systematic of the
biases operating in opinion survey work and is probably responsible for
the Republican bias in the results of many past election polls. More
serious in its effects would be the continual pro-conservative bias in
the studies of opinion on important public issues in the interim between
elections.

What can survey agencies do to minimize such biases? An approach
involving matching or dovetailing characteristics of interviewer and
respondent is severely limited by labor market and administrative
conditions. First of all, the existing composition of interviewer staffs
is determined largely by the nature of the work: the fact that
interviewing is a white-collar part-time job with a low hourly pay rate
means necessarily that most interviewers will be people who do not have
primary responsibility for a family and will be drawn predominantly from
among middle-class housewives. Hence, apart from such experiments
as Katz made, the economics of survey work exclude most working-class
people from interviewing. So under existing conditions, the general
composition of interviewing staffs cannot be greatly altered. And
even for special types of surveys in which group disparities might be
considered as particularly great potential sources of bias, operating
conditions impose severe limitations on any approach to minimizing
biases through matching characteristics of interviewer and respondent.

Sheatsley states the problem clearly:


Although most research agencies handle a wide variety of studies, the
composition of their field staffs can be modified in only very minor
ways. By and large, the same interviewers must be used for all types of
studies because they have been trained for our work, at considerable
expense, and because it would not be possible to recruit and train a
different nationwide field staff for each particular type of study we
conduct.


Furthermore, most market and opinion surveys are national
cross-sectional studies, so that each interviewer must interview a
representative sample of all types of people in his own town. Even if
it were feasible to employ many different interviewers in the same
area, there is no means of “matching” interviewer and respondent in
advance.

However, some of the survey agencies have made some attempt to
achieve a partial “matching” by trying to make the field staff a
representative sample of the population being studied (usually a
national cross-section) with respect to certain characteristics, e.g.,
by hiring approximately equal numbers of men and women or proportionate
numbers of Republicans and Democrats, on the theory that biases will
cancel out. This is an application of Mosteller’s expedient of equal
numbers of pro- and con-interviewers, to be discussed later on. Agencies
which maintain large field staffs, such as AIPO, tend to emphasize this
solution, since greater flexibility of the large staff enables the
agency to select its interviewers to fit the study. Such attempts have
not been completely successful, and in any case, do not greatly affect
potential reactional biases, since they are directed mainly toward
minimizing ideological bias of the interviewer rather than differential
respondent reaction to the interviewer.

Smaller agencies cannot use this approach and hence rely largely on 
training methods to avoid bias. It is possible for these agencies to exer- 
cise closer supervision over their smaller staffs and to train each inter- 
viewer in talking to all kinds of people. No matter how intensive the 
training in correct interviewing procedures may be, however, it cannot
eliminate biases from respondent reactions to the appearance of the
interviewer himself. 


3. CONTROL OF ERROR THROUGH MODIFICATION OF THE SITUATION 


Perhaps the most practical approach to the reduction of interviewer 
effect lies in greater control over or modification of the situational 
factors which mediate effects. The discussion in Chapter V points out
that the psychological processes and tendencies in interviewer and re- 
spondent which lead to bias remain latent until the conditions of the 
interview situation permit their manifestation. Where the effects 
manifested by an interviewer are consistent, they are caused mainly by 
personal factors, and the approach of better interviewer selection and 
training would be most fruitful. But where effects are inconsistent, 
situational factors are chiefly responsible, and our aim should be to 
modify these conditions in so far as possible to render them less favor- 
able to the realization of the latent biasing tendencies. 

Implicit in the standardization of instructions and interview pro- 
cedure, which is common practice in survey work, is the continuing 
effort to minimize interviewer effect by control over the situational 
conditions and over the interviewer's behavior in response to these con- 
ditions. But as our study has shown, this control has not always been 
effective against situational stresses.

Some aspects of the interview situation which may lead to bias are
not manipulable, as we pointed out in Chapter V. Aside from the
difficulty of controlling the personal factors or psychological
propensities within the interviewer which lead to bias in certain
situations, the respondent himself cannot be controlled, and the broader
objectives of the survey may conflict with the effort to modify biases
inherent in the situation; e.g., we may have to ask a series of
questions on interrelated attitudes even though such a series may
dispose toward maximum operation of attitude-structure expectation
effects. Other limitations were mentioned in Chapter V. Controls must
not be applied to the extent that they reduce the interviewer's ability
to use his skills or the respondent’s feeling of ease in the interview.
Nevertheless, the theory of effects of situational factors elaborated in
Chapter V contains many implications for modifying the situation so as
to eliminate or reduce interviewer effects. The researcher must weigh
these potential gains against other considerations and make decisions
most appropriate to his own research problems. Thus, for example, the
evidence that lack of structure in procedures is a major source of error
would normally lead to the conclusion that the use of field ratings and
open-ended questions should be avoided. However, there may well be
overriding considerations dictating the use of such procedures. Under
such conditions of a need to use potentially dangerous procedures, one
must seek the control of error through the other means suggested. One
would then seek by training and appropriate administrative policies to
produce a staff which could undertake such procedures with impunity.

Although the mere presence of the interviewer is often sufficient to
induce some bias, effects will increase in the degree that the
personality of the interviewer enters the situation as a focus for the
respondent. The available techniques for collecting information may be
scaled according to the degree to which they “socially involve” the
respondent in this manner, from minimum to maximum involvement.


1. Self-administered questionnaires, which may be mail questionnaires
or self-enumeration schedules picked up by the interviewer.
2. Secret ballots, handed to the respondent by the interviewer, but
filled out in the interviewer's presence.
3. The “deliberative” technique, by which the interviewer leaves the
questionnaire for the respondent to “think about” and returns later to
conduct the interview.
4. The personal interview of the usual type.

The tests cited in Chapter V do not conclusively demonstrate that
effects uniformly increase with the presumed increase in opportunity for
respondent reaction from the first to the fourth of these techniques,
and it was pointed out that respondent reaction to perceived group
membership may function partly independently of verbalization by the
interviewer. However, where the respondent's prestige is involved in
the answer to the question, or where the questions are of a highly
personal nature or otherwise embarrassing to either interviewer or
respondent, there is some evidence that effects will tend to be greater
as the technique employed increases the ratio of “social involvement” to
“total involvement.” For questions of this type, research agencies might
consider more frequent employment of the less socially involving tech- 
niques, or at least a combination of techniques, with the usual type of 
personal interview reserved for those questions which experience has 
shown are less productive of bias, unless other gains to be derived 
through the agency of the interviewer are paramount. Where these 
other gains dictate the use of the personal interview, variations within 
the interview should be attempted of such a nature as to alter the 
respondent's perception of the saliency of the interactional process. 
One such modification involving interview techniques by which the 
interviewer asks the questions but does not record the answers in the 
respondent’s presence has been used in the past, on the theory that the 
respondent may feel more at ease and talk more freely than when paper 
and pencil are used in his presence. Under one method, the “recon- 
structed” interview, the interviewer fills out his schedule after he leaves 
the respondent. This procedure, of course, places a severe strain on the 
interviewer's memory. It seems that possible reductions in bias through 
better rapport would be offset by increased opportunity for the in- 
terviewer's biasing tendencies to come into play as a substitute for his 
imperfect recollection of the respondent’s answers. Particularly atti- 
tude-structure expectations might influence recording, because the 
interviewer would probably recall at least the general attitude of the 
respondent and might use it as a clue to the answers imperfectly re- 
called. Payne reports errors in one-fourth of the cases when the “re- 
constructed” schedule was compared with tape recordings of the same 
interview, though many of the errors were trivial. Probably this is a
conservative measure of the reconstruction error that would normally
occur, since the interviewers in this case knew they were being
checked. Another example of error in the “reconstructed interview” is
given in an experimental investigation of the counseling interview cited
in Chapter I. The completeness and accuracy of the reports were
determined by comparing them with phonographic recordings of the
corresponding interviews. The reports were written immediately after the
interviews and the counselors were aware that the interviews were
recorded. Most of the material actually reported was accurate (75-95 per
cent), but over 70 per cent of the interview material was omitted. Some
of the omissions were important, so that, according to the author, the
reports "gave a somewhat distorted picture of the contents of the
original interview" and were a poor substitute for the typewritten
transcription of the phonographic recording.

Bevis describes a survey of gasoline station attendants in which tape
recordings were used to take down the respondent’s exact words
through the device of concealing a microphone and recording apparatus
in the interviewer's car. Employment of tape recorders would,
if unknown to the respondent, not only increase his feeling of ease but
would eliminate all recording bias, as well as provide a check on bias in
asking questions and in probing. However, besides the technical
difficulties of using and concealing bulky apparatus in a home
interview, the method seems highly objectionable on grounds of ethics
and public relations. The secret would “out” sooner or later, and public
reaction against the polls might be disastrous, since such records could
conceivably be used to the respondent's disadvantage by a third party.

Mechanical demands upon the interviewer may result in pressure so
great as to demoralize him, causing him to cheat or distort the data,
consciously or unconsciously, to comply with the mechanical
requirements of the task. Psychological difficulties for the interviewer
may arise from requirements of the survey, which lead to respondent
resentment, embarrassment or apathy, or simply from general respondent
resistance. Again distortion and cheating behavior may arise. In a
situation of conflict between the demands of the job and those of social
relations with the respondent, the latter may take precedence,
especially since maintenance of good rapport may be necessary to get the
job done at all.

Frequently, these difficulties are beyond the control of the survey
director. However, in so far as they stem from survey procedures,
these should be modified so far as possible to avoid such difficulties.
Specific aspects of procedure which should be carefully considered are
the content and form of questions. Types of questions which tend
to produce psychological difficulties for the interviewer or unfavorable
reactions in the respondent should be avoided as much as possible or
special techniques employed to mitigate the psychological difficulties
involved.

But all such questions cannot be eliminated. Frequently they may be
essential objectives of the survey or otherwise essential to it.
However, it may be possible to lessen their biasing possibilities in
other ways: (1) By use of the less “socially involving” data collecting
technique. Income questions might, for example, be obtained via the
secret ballot, even where the rest of the schedule is administered by
personal interview. (2) By attention to the question sequence on the
schedule. Questions of types likely to arouse resentment
should not be placed at the beginning of the interview, where they might
destroy rapport at the outset, unless the survey purpose makes this
order mandatory, as, for example, when necessary to determine whom
to interview. (3) By greater attention to simplification of wording.

In some cases, attitude-structure expectation effects might be
minimized by embedding the significant attitude questions in a context
of questions which have no presumptive attitudinal relation to each
other, or by placing related questions as far apart as possible to
prevent the carry-over in the interviewer's mind.

The situational pressures which bring into play certain biasing
tendencies as an aid in coping with the difficulties of the interviewing
task are attenuated by experience. The experienced interviewer has had
practice in learning how to overcome many of the difficulties that arise
in interviewing, and hence he is less hostile to such difficulties, is
able to maintain a more detached or professional attitude in cases where
the inexperienced interviewer might try to find a way out of his
troubles by the conscious or unconscious employment of his own
preconceptions or expectations. Thus the implications of Chapter V for
the modification or control of the situation to minimize bias are most
relevant when inexperienced interviewers have to be employed.


4. CONTROL THROUGH CANCELLATION OF EFFECTS


The empirical approaches to the control of interviewer effect which
we have discussed so far are concerned with control of error at the
source, through better selection and training of interviewers, matching
interviewer and respondent characteristics, and elimination of
situational pressures. Another approach simply attempts to produce
zero net effects in the behavior of interviewers in the field . . .
arising from . . . favorable to his own . . . assignments are
interpenetrating, the effect will be confined chiefly to the minimizing
of ideological sources of bias affecting the results, with no bearing on
biases arising from other sources, such as expectation, class
differences, or question wording.

Furthermore, there are a number of practical difficulties in applying
this expedient. The labor market and operating conditions involved in
hiring and maintaining an interviewing staff do not permit the
continual juggling that would be necessary to insure an equal number of
pro- and con-interviewers on every issue. Even in a single survey,
usually a number of different issues are involved, so that it would be
impossible to obtain an equal division of opinion on all of them.
However, the principle might profitably be applied in situations which
experience has shown to be most productive of ideological bias, or where
recurring surveys of the same or similar type are undertaken. For
example, opinion research agencies engaged in pre-election polls and in
studying other issues highly correlated with political party affiliation
might, on this principle, maintain approximately equal numbers of
Republican and Democratic interviewers on the staff, as some of them
try to do. But since labor market conditions and the nature of
interviewing work bring about a high degree of homogeneity of
interviewers' characteristics, equal distribution of opinions on most
issues would seem to be difficult to obtain.
We refer the reader to the original source for Mosteller’s detailed
formulation of the problem. To summarize the argument here, note
that the formula for the net bias may be stated as:


Net bias equals
pro-bias per pro-interviewer times per cent pro-interviewers
minus
con-bias per con-interviewer times per cent con-interviewers.


Now, if the tendencies of pro-interviewers to get too many
pro-responses are, on the average, equal in strength to the tendencies
of con-interviewers to get too many con-responses, it is clear from the
equation above that the opposing biases will cancel if, and only if, the
numbers of pro- and con-interviewers are equal. For every Republican
interviewer who obtains, say, 5 per cent too many pro-Republican
answers, there will be a Democratic interviewer who gets 5 per cent too
many pro-Democratic answers. In most practical situations, there will
be no basis for assuming a differential biasing tendency, so that on
practical grounds equalization of the number of pro- and
con-interviewers is indicated.
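
The verbal formula for net bias can be checked numerically. What
follows is a minimal sketch in Python, not part of Mosteller’s
formulation: the function name `net_bias` and its parameter names are
illustrative assumptions, with biases expressed in percentage points and
the staff division as a fraction.

```python
def net_bias(pro_bias, con_bias, pro_share):
    """Net bias in percentage points, following the verbal formula.

    pro_bias  -- excess pro-responses per pro-interviewer (points)
    con_bias  -- excess con-responses per con-interviewer (points)
    pro_share -- fraction of the staff who are pro-interviewers (0 to 1)
    All names are hypothetical; only the formula comes from the text.
    """
    con_share = 1.0 - pro_share
    return pro_bias * pro_share - con_bias * con_share


# Equal biasing tendencies (5 points each) with an evenly divided staff:
# the opposing biases cancel.
balanced = net_bias(5.0, 5.0, 0.5)

# The same tendencies with a 70/30 staff leave a residual pro bias.
unbalanced = net_bias(5.0, 5.0, 0.7)
```

Under these assumed figures the evenly divided staff gives a net bias of
zero, while the 70/30 staff gives about 2 percentage points in the pro
direction.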

If we are unwilling to make the assumption that the opposing biases
are equal in strength, Mosteller argues that an equal distribution of
pro- and con-interviewers is preferable on the grounds of symmetry; that
is, the possible biases, under all possible assumptions, would distribute 
themselves symmetrically about zero if—and only if—we have an equal 
distribution of interviewers. For example, suppose that there is a 10 per 
cent difference in results between pro- and con-interviewers. If we 
make in turn the extreme assumptions that all of this bias were attribut- 
able to the pro-interviewers or to the con-interviewers, the maximum 
possible biases are plus and minus 5 per cent if the interviewers are 
equally divided. If the interviewers are not equally divided, one of the 
limits will be above 5 per cent in one direction, the other below 5 per 
cent in the other direction. Suppose there are 70 per cent
pro-interviewers and 30 per cent con-interviewers. Then, under all
possible assumptions, the biases range from plus 7 per cent to minus 3
per cent, so
that the maximum possible net bias is greater in absolute magnitude 
than for the case with an equal distribution of interviewers. 

It can also be shown that the average absolute distortion over all pos- 
sible (not probable) divisions of the total bias is smaller for the equal- 
distribution case. Since the possible biases in this case range from -$ 
per cent to +5 per cent, the average absolute bias (without regard to 
sign) is 2% per cent. Consider again the case where 70 per cent of the 
interviewers are pro, 30 per cent con, with possible biases ranging from 
—3 to 4-7 per cent. The possible biases from —3 to —5 per cent are re- 
placed (as compared with the equal distribution case) by possible biases 
ranging from +5 to +7 per cent, a difference of 2 per cent absolute on 
the average over one-fifth of the range. Hence the average possible bias 
in this case is 0.4 per cent greater than in the equal distribution case." 
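
The arithmetic of the symmetry argument can be checked with a short sketch. The function names below are illustrative only and do not come from the text; the sketch assumes, as the argument does, that the total pro/con difference may be divided between the two groups in any proportion, so that the possible net biases are spread evenly between the two limits.

```python
def net_bias(pro_bias, con_bias, share_pro):
    """Pro-bias times share of pro-interviewers minus con-bias times share of con-interviewers."""
    return pro_bias * share_pro - con_bias * (1.0 - share_pro)


def bias_limits(total_difference, share_pro):
    """Extreme net biases when the whole pro/con difference is attributed to one side."""
    upper = net_bias(total_difference, 0.0, share_pro)  # all bias due to pro-interviewers
    lower = net_bias(0.0, total_difference, share_pro)  # all bias due to con-interviewers
    return lower, upper


def average_absolute_bias(lower, upper):
    """Mean of |b| when b is spread evenly between lower (<= 0) and upper (>= 0)."""
    return (lower ** 2 + upper ** 2) / (2.0 * (upper - lower))


if __name__ == "__main__":
    for share_pro in (0.5, 0.7):
        low, high = bias_limits(10.0, share_pro)
        print(share_pro, round(low, 2), round(high, 2),
              round(average_absolute_bias(low, high), 2))
```

With a 10 per cent difference, the equal split gives limits of -5 and +5 and an average absolute bias of 2.5, while the 70/30 split gives -3 and +7 and an average of 2.9, the 0.4 per cent excess cited in the text.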

In case the interviewers are not equally divided on an issue but an
estimate of the total bias is available, the assumption of equal biasing
tendencies could be used to correct the results, providing we can be
sure that pro- and con-interviewers were assigned equivalent samples.
Suppose that the interviewing staff consists of 60 per cent Republicans
and 40 per cent Democrats, and that the Republican interviewers obtain
57 per cent pro-Republican responses as against 47 per cent for the
Democratic interviewers. The unadjusted estimate of the pro-Republicans
in the population is 53 per cent (60 per cent × 57 per cent + 40 per
cent × 47 per cent). The biases are not self-canceling, since we do not
have an equal distribution of interviewers. To correct for this, we
might assume that the 10 per cent difference is composed of a 5 per
cent pro-Republican bias for the Republican interviewers, and a 5
per cent pro-Democratic bias for the Democratic interviewers. In
other words, we assume that both groups should have obtained 52 per
cent pro-Republican responses, so that this would be the corrected
estimate. Clearly, however, adjustments of this kind would be risky
unless extensive experience had shown them to be reliable.
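
The correction just described amounts to taking the midpoint of the two groups' results in place of the staff-weighted average. A minimal sketch follows; the function names are hypothetical, and the adjustment rests entirely on the assumption of equal and opposite biases and equivalent samples.

```python
def unadjusted_estimate(share_rep, pct_from_rep, pct_from_dem):
    """Weight each group's result by its share of the interviewing staff."""
    return share_rep * pct_from_rep + (1.0 - share_rep) * pct_from_dem


def adjusted_estimate(pct_from_rep, pct_from_dem):
    """Assume equal and opposite biases, so the midpoint is the corrected figure."""
    return (pct_from_rep + pct_from_dem) / 2.0


if __name__ == "__main__":
    # Figures from the example: 60 per cent Republican interviewers,
    # 57 vs. 47 per cent pro-Republican responses.
    print(round(unadjusted_estimate(0.60, 57.0, 47.0), 1))  # staff-weighted figure
    print(adjusted_estimate(57.0, 47.0))                     # midpoint figure
```

For the figures in the text the staff-weighted figure is 53 per cent and the midpoint is 52 per cent.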
[...], however, provided a demonstration that biases in opposite
directions do not necessarily cancel each other. There it was shown
that in a particular case of bias connected with omission of an
alternative, majority interviewers exercised their bias by inflating the
category which they themselves would have selected, while the bias of
minority interviewers usually took the form of inflation of the "Don't
know" category. In this case, at least, the result is a systematic net
bias [...] how biases operate, [...] formula seems unwise.

In the unlikely case that we have actual information about the relative
strength of the opposing biases, the number of pro- and con-interviewers
assigned should be in inverse relation to the biases. If, for example,
we have a total of thirty interviewers, and we know that
pro-interviewers exert a 10 per cent bias, con-interviewers a 5 per cent
con-bias, then ten of the interviewers should be favorable on the issue,
while twenty should be opposed. The total bias in each direction will
then be equal, since the greater strength of the pro-bias is offset by a
proportionately smaller number of interviewers exercising this bias.
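
The inverse-allocation rule can be written out as a small sketch. The function name is hypothetical; the sketch assumes the two biasing strengths are known, which the text itself calls an unlikely case.

```python
def allocate_inverse_to_bias(total_interviewers, pro_bias, con_bias):
    """Split the staff so that (number x bias) is the same on both sides."""
    n_pro = total_interviewers * con_bias / (pro_bias + con_bias)
    n_con = total_interviewers - n_pro
    return n_pro, n_con


if __name__ == "__main__":
    n_pro, n_con = allocate_inverse_to_bias(30, 10.0, 5.0)
    # Total bias exerted in each direction (interviewers x per cent bias).
    print(n_pro, n_con, n_pro * 10.0, n_con * 5.0)
```

With thirty interviewers and biases of 10 and 5 per cent, this yields ten pro-interviewers and twenty con-interviewers, and the product of number and bias is the same in both directions.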
Since the Mosteller procedure deals only with marginals, some other
method would be desirable to minimize interviewer effect for subgroup
characteristics and for comparisons between subgroups. In fact, as we
pointed out in Chapter VI, in public opinion research the main interest
of the analysis is not so much in marginal totals as in certain
functional relations, as, for example, comparisons between classes of
the population. We can often tolerate considerable error in the
marginals, provided these functional relations are relatively free from
distortion.

One device that may be effective in minimizing such distortion is the
use of interpenetrating samples. In the first place, the use of
interpenetrating samples gives assurance that no single class is unduly
influenced by the idiosyncrasies of one or a few interviewers. For
example, if we are studying the attitudes of various classes on some
public issue, the ideal distribution of assignments would be to give
each interviewer an equal random sample of the cases within each class.
If a single interviewer tended to bias results for some particular
class of respondents, the distortion introduced into the results for the
class by this interviewer would be attenuated by the data obtained by 
the other interviewers. More important, the bias in comparisons be- 
tween subgroups will be minimized. Even though the biases for the dif- 
ferent subgroups tend to be fairly constant where a large number of 
interviewers are employed, a high degree of clustering of assignments 
is likely to result in distortion of subgroup comparisons because of 
interviewer variability and also because of interaction between inter- 
viewers and classes (certain interviewers may bias results particularly 
for certain classes). Use of interpenetrating samples will tend to insure 
the constancy of biases over the different subgroups so that no distor- 
tion or very small distortion in the comparisons between classes will 
occur. 
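
A minimal sketch of the "ideal distribution of assignments" described above: within each class the cases are shuffled and dealt out to the interviewers in rotation, so that every interviewer receives an equal random share of every class. The function name, the class labels, and the use of a fixed seed are illustrative assumptions, not part of the original procedure.

```python
import random
from collections import defaultdict


def interpenetrating_assignment(cases_by_class, interviewers, seed=0):
    """Give each interviewer a random, near-equal share of the cases in every class."""
    rng = random.Random(seed)
    assignment = defaultdict(list)
    for class_label, cases in cases_by_class.items():
        shuffled = list(cases)
        rng.shuffle(shuffled)  # random order within the class
        for position, case in enumerate(shuffled):
            interviewer = interviewers[position % len(interviewers)]
            assignment[interviewer].append((class_label, case))
    return dict(assignment)


if __name__ == "__main__":
    # Hypothetical sample: 6 cases in one class, 9 in another, 3 interviewers.
    sample = {"upper": list(range(6)), "lower": list(range(6, 15))}
    plan = interpenetrating_assignment(sample, ["A", "B", "C"])
    for name, cases in plan.items():
        print(name, cases)
```

Because every interviewer works in every class, a single interviewer's peculiarity is diluted within each class instead of being concentrated in one of them.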

Interpenetrating samples have also often been used for experimental
purposes in the control of error, particularly for measurement of
interviewer or sampling variability. Their most extensive use for this
purpose has been in the experimental work of Mahalanobis in India,
discussed later on.

Financial and operating considerations usually dictate a considerable 
degree of clustering of assignments. However, the repeated evidence 
from experimental studies of interviewer effect that bias tends to con- 
centrate among a few aberrant interviewers suggests the desirability of 
employing this principle of spreading risk as much as possible. 

Methods of error control may be directed toward ironing out the
variability between interviewers, as, for example, training methods
which may at least produce homogeneous standards within the
interviewing staff, although they may also leave some constant error.
Like interpenetrating samples, reduction of interviewer variability
brought about by the uniformizing effect of training will have the
useful effect of reducing the error in subgroup comparisons. Such a
reduction would occur when whatever bias was produced by, or remained
after, the homogenizing effect of training was in the same direction for
both subgroups being compared, which seems fairly probable. As an
example, suppose Interviewer A's respondents are largely middle- and
upper-class, while Interviewer B's respondents are lower in the social
scale. On some opinion questions, more intensive probing might tend to
push the majority of the responses which were initially "DK's" into the
"yes" column. If Interviewer A probes more frequently and intensively
than Interviewer B, his higher-class respondents will show a higher
proportion "yes" merely because of the difference in probing behavior.
If training methods succeeded in producing greater uniformity in the
probing behavior of A and B, differences arising from the different
"DK" rate would be reduced.

It is conceivable, though, that homogeneity might increase the error
in cross-tabulation. This would be true if the result of training was to
produce greater bias for one subgroup than another, or biases in
different directions for two subgroups. This might even occur as a
result of a procedure designed to reduce bias in the marginals, if, for
example, the procedure could be applied more easily to some classes of
respondents than others, but such an effect of homogeneity would seem
unlikely.
A classic example in the use of training methods to produce uniformity
in personnel interviewing was presented by L. J. O'Rourke of the Civil
Service Commission in 1929. The qualifications of 4,000 applicants for
positions as prohibition officers had to be evaluated by [...] oral
examiners. A set of hypothetical, but realistic problems, concerned with
the investigation of reported prohibition law violations, was
constructed to test the judgment, resourcefulness, and skill of the
applicants. The problem was presented to applicants by the examiner or
interviewer in a uniform manner; the possible questions the applicant
might ask the interviewer were anticipated and worked out in advance. A
list of answers or statements was available for the interviewer's use in
replying to each of the possible questions. The applicant was asked to
tell how he would go about investigating the case. Every procedure which
the applicant [...] the interviewer, and for each [...] probes or
follow-up questions was listed, so that the [...] prepared with a logical
and uniform method of probing that suggestion. [...] of numerical values
was preassigned to the anticipated answers, questions, and suggestions
of the applicant, and the [...] with a table of these values applicable
[...]. On this basis, objective ratings of the applicants [...] made.

Examiners were given an intensive training course, during which the
[...] group of thirty trainees witnessed the same oral examinations with
Commission employees playing the role of "applicants," and each trainee
[...] of four possible ratings, say A, B, C, or D. The first three
"applicants" were rated before the training began. Comparison of the
distribution [...] ratings for these three with their ratings for the
[...] twenty-second "applicants," given in Table 81, below, shows how the
training course tended to increase uniformity in the ratings:


TABLE 81

INCREASE IN UNIFORMITY IN RATING OF APPLICANTS AS A RESULT OF TRAINING*

                               BEFORE TRAINING         AFTER TRAINING
RATING                         Applicant Number        Applicant Number
                               [...]  [...]  [...]     [...]  [...]  [...]
[...]                          [...]  [...]  [...]       1      —      —
[...]                          [...]  [...]  [...]       9     27      —
[...]                          [...]  [...]  [...]      20      3      3
[...]                          [...]  [...]  [...]       —      —     27
Percentage in largest
  rating group                  47     47     43        67     90     90

* The numbers given in the table are approximate, having been inferred from the original graphic distribution.


Although the training of civil service examiners provides an extreme 
case of standardization, it is possible that this approach might be more 
extensively used in certain types of recurring opinion surveys, where 
most of the possible answers of respondents, both direct and equivocal, 
might be anticipated and probes worked out in advance for the guid- 
ance of the interviewer. To a limited extent, such a procedure is fol- 
lowed now by opinion research agencies in their instruction manuals, 
but the recommendations given in these manuals usually apply to gen- 
eral situations encountered in many surveys, rather than to a specific 
survey, though the procedure is used to some extent in the
specifications or instructions for individual surveys.

However, training and other methods of handling interviewers
(selection, dismissal, contacts, etc.) may not only produce homogeneity
but also diminish error. Occasional checks for bias may be instituted
in nonexperimental surveys through the use of supplementary questions,
minor modifications in survey design, or in assignment of sample cases
to interviewers, which will enable the survey agency to single out the
interviewers most prone to bias; either intensive retraining or
dismissal of the aberrant interviewers may then be effective in reducing
bias. These, together possibly with infrequent specially designed
studies, could be used to supplement the usual ratings of interviewer
performance as a guide in handling dismissals. The evidence already
given for generally superior performance of experienced interviewers
seems to show that present training and dismissal practices do seem to
weed out the poor interviewers and thus reduce interviewer bias.
Most studies of interviewer effect, however, have not been so designed
as to yield evidence on which interviewers were biasing results.
Conceivably erroneous judgments as to which interviewers are superior
could eliminate interviewer variability by eliminating the deviant
interviewers while giving results of complete invalidity, because a
homogeneously bad staff had been selected. Sometimes internal evidence
will furnish a clue to the relative validity of the results.
Occasionally, independent checks may be available, as in the NORC Denver
Community Survey, in which official records of the characteristics of
each respondent gave an opportunity to measure the relative validity of
the results obtained by the different interviewers.


5. CONTROL THROUGH FORMAL OR MATHEMATICAL METHODS 


The approaches to reduction of interviewer error discussed thus far
have all been concerned with manipulation of the factors responsible
for error. Another approach involves estimation of the magnitude of
error. Such estimates are of considerable value in the analysis and
interpretation of the data, and they are useful in determining how the
error arising from the interview process may be minimized in future
surveys. [...] discussion of the advantages of the approach will be
[...] shortly.

In Chapter VI, several different classes of [...] effect were
distinguished. Gross effects [...] all deviations of responses recorded
by the interviewer [...] response as defined for the study. Net effects
were [...] difference between the distribution of responses obtained by
[...] interviewers and the "true" distribution of responses for the
population interviewed. Since errors in opposite directions may cancel
[...], net effects may be negligible or absent even when a considerable
[...] gross effect occurs. Also net effect may occur [...] while
canceling out over all interviewers, [...]. Net effect or bias in the
distribution of the responses [...] population of respondents
interviewed by all interviewers. The [...] inter-interviewer variation
is based on the concept of a potentially [...] distribution of responses
if [...]
that the measurement of inter-interviewer variability does not provide
a measure of the constant bias or net sum of biases over all interviewers. 
In fact inter-interviewer variation may be zero even when a large net 
effect exists, if the bias is constant over all interviewers, a condition 
which may sometimes be approximated in practice because of homo- 
geneity produced by training methods or by the composition of the 
interviewing staff. In sum, interviewer variance represents the error 
about the “expected value” for all the interviewers, while net inter- 
viewer bias represents the deviation of this expected value from the 
true population mean. Total interviewer error is the sum of the two 
kinds of errors. 
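
The distinction between the two kinds of error can be illustrated with a small sketch. The figures and the function name are hypothetical; the sketch treats each interviewer's result as his expected value, takes net bias as the deviation of the staff average from the true value, and takes inter-interviewer variability as the spread of the interviewers about that average.

```python
from statistics import mean, pvariance


def net_bias_and_variability(interviewer_values, true_value):
    """Return (net bias, inter-interviewer variance) for a set of interviewer results."""
    expected_value = mean(interviewer_values)
    return expected_value - true_value, pvariance(interviewer_values)


if __name__ == "__main__":
    true_value = 50.0  # hypothetical true percentage
    # Constant bias over all interviewers: large net effect, zero variability.
    print(net_bias_and_variability([55.0, 55.0, 55.0], true_value))
    # Compensating errors: no net bias, but marked variability.
    print(net_bias_and_variability([45.0, 55.0], true_value))
```

The first case shows why a measure of inter-interviewer variation alone cannot reveal a constant bias; the second shows variability without any net bias.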

The condition for the absence of bias is that the response errors of 
different interviewers (deviations from the true values) be compensat- 
ing, while the condition for the absence of inter-interviewer variability 
is zero correlation between the response errors obtained by a single 
interviewer. If the response errors of any interviewer tend to deviate 
in the same direction from the average error for all interviewers, his 
errors will be correlated. Hence, both the presence and absence of
inter-interviewer variability may occur in conjunction with the presence
or absence of net bias, depending on the co-existence of the two
conditions. But if inter-interviewer variability is present, it means
that at least some of the interviewers are introducing distortion, and
it is not safe to assume that the individual biases will cancel in the
aggregate.

There are a number of ways in which the measurement of interviewer 
error, in the form of measurement of gross or net effects or measure- 
ment of inter-interviewer variability, may contribute to the reduction 
and control of error: 

1. By showing whether there is a problem, that is whether interviewer
effect is large enough to be of special concern.

2. In interpreting survey results, measurements of gross and net
effects make it possible to take account of the degree of invalidity of
the data, while measures of inter-interviewer variation as a component
of sampling variability enable us to state the degree of reliability of
survey results.

3. A series of such measurements may localize the interviewer error.
If it is found that particular questions or particular content areas are
most productive of effects, attention can be directed toward improving
survey procedures in such areas, and the survey organization will know
where to place the emphasis in the training of interviewers. Studies of
inter-interviewer variability as well as studies of gross and net
effects may serve this purpose, since, as we mentioned before,
significant interviewer variation indicates that at least some of the
interviewers are distorting the results. However, only studies of gross
and net effects can reveal the presence of biases which are fairly
constant over all interviewers or show clearly which interviewers are
biasing the data. It may be that the interviewers whose results show the
greatest departure from the average are obtaining the more valid data,
but if we assume that the opposite is true, we may sometimes be able to
track down the error by spot checks of the schedules for the aberrant
interviewers or by reference to a priori considerations or experience.
Where the error is successfully localized in particular interviewers,
intensive retraining or dismissal may be effective in reducing error.

4. Isolation of the component of sampling error due to interviewer
variation may enable us, under certain assumptions, to determine how
great an increase in the number of interviewers is necessary to bring
about a desired reduction in interviewer contribution to the sampling
error, or to determine the optimum number of interviewers to give
minimum variance for a fixed cost, or minimum costs for a fixed degree
of reliability.
5. Alternative survey methods may be employed experimentally [...]
samples within a single survey. If one method (such as the use of [...]
or supposedly superior interviewers) can be [...] reason to be
relatively unbiased, the bias under the less [...] can be estimated as
the difference between [...]. Then comparison of [...] the two methods
will enable us to [...] which gives the minimum [...] or to combine the
two [...] sampling design.

[...] of differential net [...] ideology, expectation, or [...] in the
measurement of gross and net effects [...] need not be re-examined [...]
exclusively to estimation of interviewer variation. Since such
estimates, [...] can be useful in the control [...] discuss here the
conditions under which it is possible to estimate interviewer
variability and methods by which the estimate may be accomplished.

The fundamental conditions for the estimation of interviewer vari- 
ance for any characteristic are that the assignments of different inter- 
viewers must be interpenetrating, that is, they must be equivalent sam- 
ples of the same population, and each assignment must consist of two 
or more sample units. The interviewer subsamples themselves may be 
simple random, stratified random, or systematic samples, and the units 
of sampling may be individual persons, households, or clusters of per- 
sons or households. Under these conditions, the variation among the 
distributions obtained by the different interviewers in the survey would 
be equal, on the average, to the variation among samples of the same 
kind and size taken by any one interviewer, provided there is no inter- 
interviewer variability, that is, provided the effect of different inter- 
viewers on recorded responses is not significantly different. Hence, by 
testing the significance of the ratio of observed variation between in- 
terviewers to the variation between respondents of the same inter- 
viewer, we can determine whether interviewer variability exists. 

It is, of course, not necessary that all the interviewer subsamples
interpenetrate. If assignments are equivalent for pairs of interviewers
or within groups of interviewers in geographic areas or other
subclasses of the population, interviewer variation can be estimated.
Such an interpenetrating design was the type used in the Denver and
Cleveland studies described in Chapter VI.

When the condition of equivalence of assignments is not met,
interviewer variation is confounded with locational variability or
variability between subclasses of the population. In normal survey
practice, a considerable degree of clustering of interviewer assignments
is usually necessary because of the expense and time required for travel
between scattered units. In many opinion and market surveys, the
population under study is the entire country, and in many of the sample
places, only one interviewer is employed. In many others, the number of
sample cases and the number of interviewers is very small, necessitating
clustering to save travel costs. Therefore, it is not ordinarily
feasible to assign equivalent samples to interviewers, even in sets.
Thus under ordinary survey conditions, interviewer variability cannot be
measured in any strict sense. This fact is often glossed over lightly
and equivalence of interviewer assignments assumed without adequate
justification. A number of instances of this kind in published studies
of interviewer error were cited in Chapter VI, where the reasonable
suggestion was made that interviewer variability has been greatly
exaggerated on this account.

Under quota sampling, in particular, interviewer variation in the
responses obtained cannot be measured in the strict sense, since the
probability that a given individual will fall into the sample or be
interviewed by a given interviewer is indeterminate. Interviewer
variation [...] is confounded with variation between the different
interviewer subsamples arising from the latitude allowed the interviewer
in the selection of respondents. Where block-quota samples are used, as
in the traditional NORC procedure in the larger cities (see Appendix
[...]), freedom is restricted by the predesignation of the [...] the
quotas are to be filled. In this [...] equivalence of interviewer
assignments may not be so greatly in error, provided the samples of
blocks are equivalent. Maximum [...] variation in responses elicited,
calculated from the observed variation in the obtained distributions of
the different interviewers, may sometimes be low enough to justify a
conclusion that interviewer variation is absent or negligible.

[...], from a practical standpoint, there may [...] measurement of the
variation even though it cannot be separated [...] the response and
selection components, when the blocks in different interviewer
subsamples represent equivalent samples of blocks in the survey area or
within subdivisions of the survey area. The observed [...] between
interviewers, divided by [...] number of interviewers, [...] used to
calculate a rough approximation to the [...] error ([...]
[...]tion between the results of different interviewers may be known from
other sources, so that interviewer variation can be separated. This might
rarely happen in the case of certain factual characteristics which might
be known for different small geographic areas from a recent census.
Of course, response errors are present in complete censuses and
response bias is probably larger usually, but the contribution of
variance between interviewers might be small because of the larger
number of interviewers. In practically all cases, however, no such
information is available.

In sum, the precise determination of interviewer variance requires
that the study be specially designed for this purpose. Under ordinary
survey procedures in the assignment of cases to interviewers, the
variance between interviewers in small groups or pairs within the same
small geographic area or in areas presumed to have closely similar
characteristics might be used to approximate interviewer variance. Where
each interviewer is assigned a single segment or area at random, a
closer approximation could be obtained by spotting the sample cases
for each interviewer on a map, subdividing the area covered by the
interviewer into two or more smaller areas, and taking the variation
between paired adjacent small subareas of different interviewers as an
approximation of the true interviewer variance. Such methods would
usually give overestimates of the variance, but at least they would set
reasonable upper limits. Perhaps one practical procedure which may be
used when recurring surveys of the same type are made would be to
design an occasional survey to measure interviewer variance, and assume
that this variance will be the same for other surveys of the same type.
However, the repeated evidence given in earlier chapters, reinforced
by some of the data cited later in this chapter, that much of the
interviewer error and bias which occur are situational in character or
occur randomly, or in the form of aberrancies of one or two deviant
interviewers, counsels caution in imputation of the same variance to
later surveys.

The concept of interviewer variability as formulated here as a form
of statistical variability implies that its effect on sample estimates
will diminish as the number of interviewers increases in the same way
that sampling error in the usual sense diminishes with the increase in
the number of units drawn into the sample, that is, in inverse ratio to
the square root of the number of interviewers. A little reflection will
show [...] conform to the limitations and demands of [...] doubled the
number of interviewers and the variation [...] remained the same, then
the effect of interviewer variability on the variance of sample
estimates would be halved. Actually, in this case, the variation between
interviewers would probably change because training procedures might
have to be altered, possibly less time given to intensive training of
each interviewer, and because a change in the size of assignment given
each interviewer would probably affect the magnitude of response errors
and the correlation of response errors within interviewer assignments.
For example, with a large assignment, fatigue or time pressure might
increase the tendency of the interviewer to cheat or to employ his own
expectations or opinions in the interpretation of equivocal responses.
In effect, then, any change in the number of interviewers results in a
different set of survey conditions, and the strict definition of
interviewer variability becomes the variation in the distribution of
responses obtained by different interviewers when a specified number of
interviewers is employed about the distribution of responses over all
possible samples of this specified number of interviewers.
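
The idealized relation described in this paragraph can be sketched as follows. The between-interviewer variance used here is a purely hypothetical figure, and the sketch assumes, contrary to what the text says will usually happen in practice, that this variance stays the same as the staff grows.

```python
import math


def interviewer_contribution(between_interviewer_variance, n_interviewers):
    """Contribution of interviewer variability to the variance of a sample estimate."""
    return between_interviewer_variance / n_interviewers


if __name__ == "__main__":
    between_variance = 0.0014  # hypothetical value, held fixed for illustration
    for k in (3, 6, 12):
        contribution = interviewer_contribution(between_variance, k)
        # The variance term falls as 1/k; the standard error term as 1/sqrt(k).
        print(k, round(contribution, 6), round(math.sqrt(contribution), 4))
```

Doubling the number of interviewers halves the variance term, so the corresponding standard-error term shrinks only by the square root of two.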
In practice, moreover, when the number of interviewers [...] markedly,
the universe from which the additional [...] differs from the universe
from which the smaller number of interviewers is drawn. The additional
interviewers may be less experienced, less able, or college students
instead of [...] so on. Hence the variability between interviewers
[...]. The effect of interviewer variability on the [...] estimates is
probably in approximately inverse [...] of interviewers up to some
number which [...] the usual number employed, but thereafter the [...]
smaller, or there may even be an [...] may increase if the additional
interviewers are [...]

[...] of respondents for the three interviewers was 1,015. The results
obtained are shown in Table 82.


TABLE 82

VARIATION OF RESULTS OF THREE DIFFERENT INTERVIEWERS ON THREE DIFFERENT
QUESTIONS

                                         PROPORTION OF RESPONDENTS GIVING THE
                                                   SPECIFIED ANSWER

                                         All 1015     Interviewer  Interviewer  Interviewer
Question                                 Respondents  A            B            C
                                                      (326 Cases)  (346 Cases)  (343 Cases)
Factual question: Do you know how to
  drive an automobile? (Answer: "Yes")      66.1         66.9         63.3        68.[?]
Information question: So far as you
  know does State X have any laws that
  limit the size of trucks, etc.?
  (Answer: "Yes")                           67.4         64.4         65.0        72.6
Opinion question: Do you think bigger
  trucks should be allowed or are they
  big enough now? (Answer: "Big
  enough now")                              74.0         73.3         71.1        71.6


The numbers of sample cases for the different interviewers are
approximately equal. Taking this average as 340, and assuming for the
moment that the interviewer subsamples were equivalent random samples
from the same universe, the standard errors of the individual
interviewer percentages on the three questions would be approximately
2.6, 2.5 and 2.3 per cent respectively. The difference obtained by
Interviewer C on the information question does not seem to be accounted
for by sampling error. However, on all three questions, the analysis of
variance was made, breaking up the total mean square into interviewer
mean square and mean square between respondents within interviewer
subsamples. On the information question, the interviewer mean square,
as expected, turned out to be significantly larger than the respondent
or sampling mean square, but not on the factual and opinion questions.
Thus interviewer variability was indicated for the information
question. The analysis of variance for this question is shown below:

The total sampling error including the interviewer contribution w45 
calculated, again on the assumption of equivalence of assignments. For 
this purpose, we can consider the analogy of cluster sampling. The r°- 
Sponses that would be obtained by a single interviewer from all indi- 
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viduals in the population would be a single “cluster” of responses. A 
sample of & of these clusters or & interviewers is selected, and within 
ie cluster a subsample of responses is taken. Thus the variance, o°,, 
idees. s estimate of P, the proportion answering “yes” would be 
a 2 usual variance for cluster sampling. Assuming that the uni- 

spondents and the universe of interviewers are very large, 


TABLE 83 
ANALYSIS OF VARIANCE FOR INFORMATION QUESTION 
Source or VARIATION Sua оғ Squares* Баас Mean Square 
Tor: 
Fae TM — 5 223.0581 1014 
ng interviewers. . 1.4101 2 -7051 = B 
mong respondents (v 
Interviewer) 221.6480 1012 2231902 4 


F-rati Mean s interview! 
ati5 xs a quare among interviewers _ 3 59 which is significant at the 5 per cent 
сап square among respondents — |eyel, 


e from above figure due to round- 
ring the differences between the 
he number of cases. 


‚ Th 

ing р), iet sum of squares = пр = 1015 (67. 62 = 223.02 (diferenci 

Interviewer Interviewer sum of squares can be obtained approximately by squa: 
Percentages and the over-all percentages and weighting each square by t 


and that interviewers had equal numbers of cases, this variance would
be [...] represents the average variation between all possible
respo[...]

The variance of p can also be calculated directly from

$$\sigma_p^2 = \frac{B}{n} = \frac{.7051}{1015} = .000695.*$$

The first term in the variance (.000479) represents the interviewer
contribution to the sampling variance. The standard error of p is
$\sqrt{.000695} = .026$ or 2.6 per cent. If interviewer variability were not
taken into account, the standard error of p would be calculated from
$\sigma_p = \sqrt{pq/n}$, or 1.5 per cent. The net effect of taking into account
interviewer variance was to triple the variance and almost double the
standard error.
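
The arithmetic of this example can be retraced with a short sketch. It assumes, as the text does, equal interviewer loads and very large universes, and it takes the sums of squares and degrees of freedom from Table 83. The split of the variance into an interviewer term and a simple-sampling term is inferred here from the figures quoted above (.000479 and .000695); the function and variable names are illustrative only.

```python
import math

def variance_of_proportion(ms_interviewers, ms_respondents, n):
    """Split the variance of p into an interviewer term and a simple-sampling term.

    Assumes equal assignments per interviewer and very large universes,
    so that the total equals (mean square among interviewers) / n.
    """
    interviewer_term = (ms_interviewers - ms_respondents) / n
    sampling_term = ms_respondents / n
    return interviewer_term, sampling_term, interviewer_term + sampling_term

n = 1015                              # total number of respondents
ms_interviewers = 1.4101 / 2          # among interviewers, 2 degrees of freedom
ms_respondents = 221.6480 / 1012      # among respondents within interviewer

interviewer_term, sampling_term, total = variance_of_proportion(
    ms_interviewers, ms_respondents, n
)
se_with = math.sqrt(total)            # standard error including interviewer variance
se_without = math.sqrt(sampling_term) # standard error ignoring interviewer variance
f_ratio = ms_interviewers / ms_respondents

print(f"interviewer term   {interviewer_term:.6f}")
print(f"total variance     {total:.6f}")
print(f"standard errors    {se_with:.3f} vs {se_without:.3f}")
print(f"F-ratio            {f_ratio:.2f}")
```

Under these assumptions the ratio of the two variances is a little over three and the ratio of the two standard errors a little under two, which is the sense of "triple" and "almost double" in the text.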

The conditions necessary for strict measurement of interviewer vari- 
ability and for calculation of the sampling error were not present in this 
example. Since this was a quota sample, the comparability of inter- 
viewer subsamples cannot be assumed, even if the three interviewers 
were working in the same geographic area. However, the fact that the 
factual and opinion questions did not show significant interviewer 
variation suggests to the authors that it was not differences in sample 
selection which caused the variability on the information question and 
that there was some peculiarity in the way C interviewed as contrasted 
with A and B. 

The explanation is reasonable enough but we could also hypothesize
that C might have tended to select respondents of slightly higher
education or class on the average, and that it is precisely on information
questions such as the one in this case, concerned with fairly obscure
state laws, that respondents of higher education or class might be
expected to show differences from the average, while the differences
between classes of respondents would be likely to be negligible on
questions like “Do you know how to drive an automobile?” or “Do you
think bigger trucks should be allowed?” In any case, the attempt to
measure interviewer variability has at least the merit of suggesting the 
type of question on which variation is most probable. 

Suggestive evidence that such selection variability may often be con- 
founded with variability within the interview is provided by another 
study reported by Stock and Hochstim. In this case, the survey was 
especially designed to test the effect of sample design on interviewer
[...]. Systematic block samples were [...]-age quotas were assigned
within the selected [...], specified respondents in specified blocks
were assigned [...]. Results on questions of six different types were
first analyzed for the two samples combined. The contributions of
interviewer variability varied considerably with the type of question.

TABLE 84
CONTRIBUTIONS OF VARIANCES TO STATISTICAL ERROR

Type of Question        Percentage Contribution of Interviewer Variance
                        to Total Variance of Estimate

To measure the effect of sample design, the interviewer variances were
determined separately for the block-quota and the probability sample.
The separate variances are shown in Table 85. On four of the six
questions, the estimated interviewer variability is practically
negligible for the probability sample, suggesting that interviewer
variability measured from a quota sample is likely to reflect chiefly
variation in the selection of respondents. However, the results are not
conclusive, since the interviewer variability was higher on two
questions for the probability sample.

The authors state that reassignment of sample blocks among inter- 
viewers resulted in a condition approaching randomness of interviewer 
subsamples, so that the analysis of variance seemed justifiable. The fact 
that most of the interviewer variances for the probability sample were 
very small lends some support to this conclusion. 

The most extensive measurements of interviewer variability under 
rigidly controlled conditions of equivalence of interviewer assignments 
are found in the continuing series of studies using interpenetrating 
samples carried out by the Indian Statistical Institute and reported by 
Mahalanobis. Surveys of housing and economic conditions of factory
workers in the Jagaddal area conducted in 1941, 1942, and 1945 provide 
an example. The survey area was divided into five geographic subareas 
or strata. Within each subarea the sample units were divided into five 
equal subsamples, each of which was an independent random sample of 
the whole subarea. Thus the five subsamples constituted five inde- 
pendent interpenetrating networks of sample units within each subarea. 
Each of the five subsamples in a subarea was assigned to a different
interviewer, and the same five interviewers were used in all five areas.

With such a design, an analysis of variance of the results could be 
made to show the contribution to total variance of areas, interviewers, 
and area-interviewer interaction. The results of this analysis for 1942 
are shown in Table 86 below. Only three of the five areas were used in 
the analysis, as the numbers of cases in the other two areas were too 
small. The numbers of cases in each of the resulting fifteen area-cells
were equalized by rejecting an appropriate number of schedules at
random.


The various components of the variance were compared with variance
within “area-investigator cells,” that is with the variance between
respondents in the same area interviewed by the same interviewer, and
F-ratios computed. In the case of age and monthly expenditures for
cereals, the interviewer variance was significant.
From the analysis of variance, estimates of the total sampling error
could be calculated. We will illustrate by the calculation of sampling
error for per capita expenditures on cereals. If B is the mean square
between investigators, the variance of the sample estimate will be
approximately [...]

$$\sigma_{\bar{x}} = .049$$

The estimate of mean per capita expenditure for cereals was 3.09.
The standard error of this estimate is approximately .05. The variance
of the sample estimate calculated in the usual manner (without taking

TABLE 86

BENGAL LABOR INQUIRY, JAGADDAL AREA, 1942: ANALYSIS OF VARIANCE
(Analysis Using Equalized Cell Frequencies)

VALUES OF VARIANCE FOR FOLLOWING CHARACTERISTICS
Columns: Source of Variation; Degrees of Freedom; Age; Expenditures in
Rupees Per Month Per Capita (Total, Food, Cereals); Consumption Per
Head Per Month
etwee ml 
Ке чел, 2 62.13 | 8052 | 365 | 0.07 79.6 
mas Ur нош ж | 30484 | 27578 323 | 125 1147 
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Total MM 
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усе айй... sa 0.40 4.794 | 3.69 | 014 0.80 
teas x 176511511015. 2.39* 164 | 2.26 | 2.55* 1.15 
etween ae stigators.. 0.61 0.77 | 0.98 | 3001 1.53 
Subsamples . k 1.10 1.59 1.70 2.471 1.52 
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standard error of .031. Thus the effect of interviewer variability was to 
increase the sampling error by something over 50 per cent.

It will be noticed that the interaction variance for cereal expenditures 
was the only significant interaction variance, indicating in this case a 
differential interviewer effect for different areas. Accordingly, the sig- 
nificant values were analyzed by area-investigator cells, and it was found 
that the abnormally high values were due to a single interviewer in one 
particular area. Where replicated interpenetrating samples of this kind
are used, it sometimes becomes possible to localize the error not only to 
a particular interviewer, but also to a particular area, so that the control 
of error is greatly facilitated.

A similar survey using interpenetrating samples was conducted in 
Nagpur in 1943. The design was arranged in the form of a randomized 
block, with five zones and four investigators each having approximately 
the same number of family schedules, about fifty, in each zone-investi- 
gator cell. F-ratios are shown in Table 87. Again the variances were 
divided by the error variance—the mean square within subsamples. 


TABLE 87
F-RATIOS OF VARIANCES IN NAGPUR FAMILY BUDGET INQUIRY, 1943

                        Total Monthly          Expenditures
Source of Variation     Income           Total     Food      Cereals
[...]                   11.06†           9.64†     8.36†     8.28†
[...]                    0.21            1.55      0.91      0.95
[...]                    0.95            1.03      2.10*     2.00
[...]                    2.96†           2.93†     2.80†     3.00†

* Significant at 5 per cent level.
† Significant at 1 per cent level.

In this case, the zones were set up purposely to differ as much as
possible, but interviewer variation was negligible. As Mahalanobis expresses
it, “Personal equations had been completely eliminated.” 

It is interesting to notice that the interaction variances found in these 
two studies are confirmatory of the theory and findings of chapters IV 
and V. In Table 87, significant interaction is shown in two cases— 
monthly expenditures for food and cereals. Since the zones were
purposely made as different as possible and all the between-zone variances
are significant, this may be interpreted to mean that the significant in- 
teraction between interviewers and zones is really evidence also of the 
existence of a “reactional” effect in the sense that the term reaction was 
used in Chapter IV, that is, an effect deriving from the reaction of a 
particular group of respondents to a particular interviewer and vice 
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concept of response error as a random variate has been used by Hansen, 
Hurwitz, Marks, and Mauldin to formulate a mathematical model for 
response errors and to derive formulas for estimating interviewer 
variance and total variance under this model. The formulas are
derived first for the case in which assignments are randomized within
groups of interviewers. Since this is a condition which may be ap- 
proximated in practice much more often than randomization of as- 
signment of the whole sample over all the survey interviewers, the 
practical utility of the formulas is increased.

Under this approach, there is a “true value” for each individual in the
population, defined in terms of the purposes of the survey. An
“individual response” is the value obtained in a particular interview by a
specified interviewer with a specified respondent at a given time, so
that the individual response will vary with any alteration in the survey
conditions. The “individual response error” is the difference between an
individual response and the true value for the individual. The
variability of individual responses is conceived to be random, so that the
response error of a particular individual in a given survey has an
expected value (the individual response bias) and a random component
of variation around that expected value. If the value to be estimated
from the survey is an average or aggregate of the “true values” for the
individuals in the population, this estimate will have a response bias, the
difference between the expected value of the average or aggregate of
observed responses and the average or aggregate of the true values, and
a response variance of the average of observed responses about the
expected value of this average.

The analysis assumes that the random components of the response er- 
ror for different individuals interviewed by different interviewers are 
uncorrelated. It is true that there may be correlation between responses 
even in this case. This might occur because of the influence of a com- 
mon supervisor or common training for two different interviewers, but 
such correlations will probably have a negligible effect on sampling
variances. It is assumed that response errors for different individuals
interviewed by the same interviewer may be correlated. The alternative
assumption of zero correlation between the random component of
response errors of a particular interviewer would imply that there is no
differential effect of different interviewers on responses and hence no
inter-interviewer variability.

The model assumes that interviewers are divided into groups, each
group being available to interview only certain classes of the
population or in certain geographic areas. A number of interviewers are
selected at [...] group and assigned an equal number of [...] among the
sample individuals in the class [...] interviewed by each group is a
random variate. [...] all interviewers and divided by the degrees of
freedom (n − k). Designate this by $M_R$.

Weighted average of mean squares between interviewers within
groups. First the sum of squares of deviations between interviewer
means and group mean for each group is divided by the degrees of
freedom for that group ($k_g - 1$), to get the mean square for each group.
Then the average of these mean squares is obtained by multiplying each
one by $k_g$, the number of interviewers in the group, summing the
products and dividing by k, the total number of interviewers. Designate
this by $M_{IG}$.

Then we have

$$s_{yI} = \text{estimate of } \sigma_{yI} = \frac{M_{IG} - M_R}{\bar{n}} \qquad (2)$$

$$s_y^2 = \text{estimate of } \sigma_y^2 = M_R + \frac{n - k}{(n - 1)\,k}\, s_{yI} \qquad (3)$$

$$s_{\bar{y}}^2 = \text{estimate of } \sigma_{\bar{y}}^2 = \frac{1}{n}\, s_y^2 + s_{yI}\, \frac{\bar{n} - 1}{n} = \frac{M_R}{n} + \frac{n - k}{n - 1} \cdot \frac{M_{IG} - M_R}{n} \qquad (4)$$

The first term of this last formula is the usual formula for estimating
the variance of a sample mean from a random sample of n units from
an infinite universe. The second term represents approximately the
increase from taking into account intra-interviewer correlation. Testing
the ratio ($M_{IG}/M_R$) of the average interviewer mean square to the
mean square between respondents for significance by the F-test will
show whether significant inter-interviewer variability is present.

Looking back at the formula for the variance of the sample mean, we
see that it can be put in the form

$$\sigma_{\bar{y}}^2 = \frac{\sigma_y^2}{n}\left[1 + \frac{\sigma_{yI}}{\sigma_y^2}\,(\bar{n} - 1)\right] \qquad (5)$$

Now, if we had only one interviewer group, that is, if the assignments
of all the different interviewers were equivalent samples of the entire
population, $\sigma_{yI}/\sigma_y^2$ would represent the correlation, $\rho$, between
responses obtained for different individuals by the same interviewer.

$$\sigma_{\bar{y}}^2 = \frac{\sigma_y^2}{n}\left[1 + \rho\,(\bar{n} - 1)\right] \qquad (6)$$

This expression shows the analogy to cluster sampling, since it is the
formula for the variance of a sample mean with a sample of k clusters
of $\bar{n}$ $\left(= \frac{n}{k}\right)$ units each.

In this case (a single interviewer group) the estimate of the variance
of the mean given in (4) reduces to:

$$\frac{\text{Mean square between interviewers}}{n} \qquad (7)$$

This is the same estimate of the sample variance used by Stock and
Hochstim in the analysis of variance approach.
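
For the single-group case, the quantities discussed above can be computed from raw responses with an ordinary one-way analysis of variance. The sketch below is illustrative only: it assumes equal assignments per interviewer, the data are invented, and the function and key names are not from the source. The covariance line uses the standard one-way identity that the between-interviewer mean square exceeds the within-interviewer mean square by the assignment size times the within-interviewer covariance.

```python
def interviewer_variance_estimates(assignments):
    """One-way analysis of variance for a single interviewer group.

    assignments: list of lists, one list of numeric responses per interviewer.
    Equal assignment sizes are assumed.
    """
    k = len(assignments)
    n_bar = len(assignments[0])
    if any(len(a) != n_bar for a in assignments):
        raise ValueError("equal assignment sizes are assumed")
    n = k * n_bar

    grand_mean = sum(sum(a) for a in assignments) / n
    means = [sum(a) / n_bar for a in assignments]

    ss_between = n_bar * sum((m - grand_mean) ** 2 for m in means)
    ss_within = sum((y - m) ** 2 for a, m in zip(assignments, means) for y in a)

    ms_between = ss_between / (k - 1)   # mean square between interviewers
    ms_within = ss_within / (n - k)     # mean square between respondents

    return {
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_ratio": ms_between / ms_within,
        # covariance of responses obtained by the same interviewer
        "cov_same_interviewer": (ms_between - ms_within) / n_bar,
        # usual random-sampling estimate of the variance of the mean
        "var_mean_simple": ms_within / n,
        # estimate allowing for interviewer variability, as in (7)
        "var_mean_with_interviewers": ms_between / n,
    }

# Hypothetical yes/no (1/0) responses for three interviewers, four cases each.
data = [[1, 1, 0, 1], [0, 1, 0, 0], [1, 1, 1, 1]]
est = interviewer_variance_estimates(data)
for name, value in est.items():
    print(f"{name:28s} {value:.4f}")
```

With real survey data the F-ratio would be referred to the F distribution with k − 1 and n − k degrees of freedom before any interviewer effect is claimed.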


Reducing Effect of Interviewer Variance 

We discussed earlier some reasons why interviewer variability (or
intra-interviewer correlation) may change with a change in the number
of interviewers. If we assume, though, that interviewer variability is
independent of the number of interviewers employed, its effect on
sample estimates will decrease as the number of interviewers employed
is increased. Under this assumption, we could minimize the effects of
interviewer variability by assigning one sample unit to each
interviewer. But increasing the number of interviewers would increase
costs of training, supervision, and travel and require reduction of
costs at [...]
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We need to emphasize that the general inverse relationship between in- 
terviewer variability and number of interviewers, as well as the equa- 
tions for determining the optimum number of interviewers based on 
this relationship, apply to estimates of marginals only. If interpenetrat- 
ing samples were used, an increase in the number of interviewers would 
have the effect of decreasing interviewer variability in subgroup esti- 
mates and subgroup comparisons, inasmuch as the number of inter- 
viewers for the respondents of each subgroup would also be increased. 
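
The inverse relationship for marginals can be illustrated with the cluster-sampling form of the variance of a mean. This is a sketch under the stated assumption that the intra-interviewer correlation does not change with the number of interviewers; the numerical values of the variance, the correlation, and the sample size are invented for illustration.

```python
def variance_of_mean(sigma2, rho, n, k):
    """Variance of a sample mean with n respondents shared equally among k interviewers.

    sigma2: variance of an individual response
    rho:    correlation between responses obtained by the same interviewer
            (assumed not to depend on k)
    """
    n_bar = n / k  # respondents per interviewer
    return sigma2 / n * (1 + rho * (n_bar - 1))

sigma2, rho, n = 0.22, 0.01, 1000     # hypothetical values
interviewer_counts = (5, 10, 50, 1000)
variances = [variance_of_mean(sigma2, rho, n, k) for k in interviewer_counts]

for k, v in zip(interviewer_counts, variances):
    print(f"k = {k:4d}  variance of mean = {v:.6f}")
```

With one respondent per interviewer the interviewer term vanishes and the variance falls to the simple random sampling value, which is the limiting case mentioned in the text. The sketch says nothing about costs or about subgroup comparisons.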

But we have to consider the effect of an increase in the number of 
interviewers under normal survey conditions, where economy of time 
and money require that assignments be made in clusters of units. Under 
these conditions, if the number of interviewers is small, units assigned 
to each interviewer may cover a fairly large geographic area and hence 
may be fairly heterogeneous in character. Any systematic biasing 
tendency of a particular interviewer, that is, a tendency of the
interviewer to obtain too many answers in one direction from all groups of
respondents, will tend to affect all subgroups leaving the subgroup
comparisons unbiased. If the number of interviewers is increased,
normally each interviewer will interview in a smaller area and his
respondents will usually be more homogeneous in character. Thus
subgroup comparisons will tend to a greater extent to be comparisons
between respondents of different interviewers and will be affected to a
correspondingly greater degree by interviewer variability. Of course,
if the degree of interpenetration is increased in proportion to the
increase in the number of interviewers so that the respondents of each of
the larger number of interviewers are scattered over as wide an area 
as when fewer interviewers were employed, some reduction in the 
effect of interviewer variability on subgroup comparisons would result. 
But this would increase cost of travel between units and render the 
optimum equations inapplicable. 

In public opinion research where, for many analytical purposes, the 
greatest interest is in the functional relationships between classes of
respondents, the application of this approach to minimizing variability in
the marginals may therefore be unwise in some cases, since it may
actually decrease the reliance to be placed on the comparisons between
classes.

The mathematical model given here can serve as the basis for an
approach to minimizing both bias and variance. A particular survey may
be regarded as subject to considerable response bias, but there may be
alternative methods that will reduce the bias, possibly with an increase
in cost. First each of the alternative methods can be tested in the
field, and from these tests, the optimum values of n and k
can be determined, as before, to minimize the variance for a specified 
total cost. The chief difficulty is in estimating the net response bias. Ex- 
perience or reasoning may lead to the conclusion that one of the 
alternative methods is probably subject to negligible response bias. As- 
suming this to be true, the bias for any other method may be estimated 
from experience or pilot studies from the difference obtained between 
this method and the method presumed to be most accurate. 

As an example of alternative methods, farm expenditures might be 
determined by direct questioning of farmers or by detailed examination 
of purchase records, the latter method being presumably more accurate 
but also more expensive. Actually, in opinion and market survey work, 
there would ordinarily be great difficulty in finding a practical
alternative to the usual survey techniques which would result in lower bias.
Factual characteristics can sometimes be validated at great cost from
independent records, as in the Denver Community Survey described
elsewhere in this report. Some alternative methods to the usual
personal interview for ascertaining opinion were described earlier,
particularly in Chapters V and VI, such as the use of mail questionnaires,
secret ballots, self-enumeration schedules, “depth interviewing,”
employment of interviewers with superior training and experience or
[...] superior qualifications, post-interview recording of [...]. The
results of experimental studies to compare alternative methods for
validity have been inconclusive for the most part, although there have
been indications that certain methods produce [...] under certain
conditions, as, for example, the [...] techniques, like the secret
ballot, on [...] prestige or questions of a highly [...]. [...]
evaluation of relative validity may [...] be developed from internal
evidence of the data or from a priori reasoning [...] theory. The
problem of finding such criteria has been discussed earlier in this
report. [...]
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spondents interviewed by both methods. To obtain the final estimate, 
the ratio of estimates for the more accurate method to estimates for the 
less accurate method for the subsample would be multiplied by the 
estimate from all cases obtained from the interviews recorded under 
the less accurate method. We refer the reader to the paper by Hansen 
and his associates for formulas for the variances of such estimates.
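
The ratio adjustment just described can be written out in a few lines. This is a minimal sketch; the function name and the figures are hypothetical, and the variances of such estimates are not addressed here.

```python
def double_sampling_ratio_estimate(full_less_accurate, sub_more_accurate, sub_less_accurate):
    """Adjust the full-sample estimate by the subsample ratio of the two methods.

    full_less_accurate: estimate from all cases under the less accurate method
    sub_more_accurate:  subsample estimate under the more accurate method
    sub_less_accurate:  estimate for the same subsample under the less accurate method
    """
    if sub_less_accurate == 0:
        raise ValueError("subsample estimate under the less accurate method must be non-zero")
    ratio = sub_more_accurate / sub_less_accurate
    return ratio * full_less_accurate

# Hypothetical figures: the cheaper method overstates by about 5 per cent
# in the subsample interviewed by both methods.
final_estimate = double_sampling_ratio_estimate(
    full_less_accurate=120.0,
    sub_more_accurate=95.0,
    sub_less_accurate=100.0,
)
print(f"adjusted estimate: {final_estimate:.1f}")
```

The adjustment is only as good as the presumption that the second method is in fact the more accurate one.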

The approach of Hansen and his associates to reducing error provides 
a logical mathematical framework for identifying and measuring inter- 
viewer error, and this clarifies the problem of error control. The ap- 
plicability of this approach in actual practice depends on the particular 
survey conditions. To summarize some of the limitations which usual 
public opinion and market survey conditions impose: 

1. Most surveys are multi-purpose in character. Estimates are usually 
desired for a number of major characteristics in marginal totals as well 
as cross tabulations to show the relationships for subclasses in the
population. The possible conflict between the optima for marginals and for
subgroup comparisons has already been discussed. But even where
the chief interest centers in the marginals, the optima for the major
characteristics will often differ widely. In the illustrative example given
in the Hansen paper, the optima ranged from two to nine interviewers
for five labor force characteristics. If some average optimum is used,
the combined efficacy in minimizing variance for a given cost will often
be slight.

2. The number of interviewers employed on a survey must ordinarily
be determined by administrative and operating considerations,
particularly limitations of time allotted to complete the survey. For three of the
five characteristics in the illustrative example, the optimum number of
interviewers for minimizing the variance for a given cost turned out to
be only two. In actual practice, it would be extremely unlikely to find
a survey for which the time limitations were sufficiently flexible to
permit the use of only two interviewers, each of whom had to cover
thirty-two sample segments. Hence the optimum equations will be applicable
only over the usually relatively narrow range permitted by the survey
conditions. Also the economics of public opinion and market surveys
require the employment of a regular staff of part-time interviewers
whose assignments must be spaced so that they get neither too much
nor too little work, and the number of interviewers to be employed in
surveys cannot be manipulated to a great extent without upsetting the
existing arrangements.

The national cross sections used in most public opinion and market
surveys permit little manipulation of the number of interviewers in the
many localities where it is feasible to employ only one or a few
interviewers.

3. Determination of the optimum number of interviewers depends 
on the assumption that interviewer variability does not change with the
number of interviewers, an assumption which is probably valid only
over a limited range for the reasons discussed earlier. Also, in practice,
whatever the effect on interviewer variability, the effect of increasing 
the number of interviewers is quite likely to be an increase in the net 
bias, since the additional interviewers usually have to be drawn from a 
different universe of persons with inferior training and experience. 

4. Many surveys are limited in scope and nonrecurring in character
so that variances and costs cannot be estimated from a previous survey.
Pilot studies used for this purpose would be very expensive if
satisfactory estimates of variances and costs, that is, estimates based on a
sufficiently large sample, are to be obtained, and the conditions of such
studies do not usually simulate those of the final survey.

5. Assuming that cost-accounting methods are capable of estimating
separately costs per interviewer and cost per respondent, these unit
costs are probably not constant but vary in some manner with the
number of interviewers. For example, cost per interviewer for training and
supervision would probably decrease as the number of interviewers
increased, but costs per respondent would go up because of the increased
travel between the units assigned to each interviewer, assuming that the
assignments were interpenetrating in equal degree.


6. [...] conditions. Equivalence of assignments, even within [...]
interviewers, is unusual, though sometimes the overlapping of the
[...] is sufficiently great that the necessary conditions for the
measurement of error may be approximated.

7. In using the approach to minimize both variance and bias, criteria
for determining the “most accurate method” or least biased among
alternative methods are very difficult to find.

8. Under a double-sampling scheme, the fact that a respondent is
interviewed twice may result in a different set of [...] under the
more accurate method than would [...] if this [...] for consistency.
Hence estimates of variances and differential biases may be affected.

In spite of these limitations, the mathematical model for response error
and the approach given for reducing error under the model can be
used on occasion by survey agencies when the necessary conditions are
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approximately fulfilled or special survey designs are used, and the re- 
sults of such studies will have some applicability to later surveys, even 
when the necessary conditions are less closely approximated. 


Correction for Interviewer Bias Associated with Differential Net Effects 


Methods of measuring differential net effects of interviewers of con- 
trasting ideology, expectations, or group membership through the use 
of Chi-squared analysis and other techniques have been illustrated fre- 
quently in this report and in the literature of interviewer bias. Earlier 
we mentioned the suggestion of Mosteller and Cantril that final results 
might be corrected for ideological bias if we can make certain assump- 
tions, usually the assumption of equivalence, about the relative strengths 
of opposing biases.

But other methods of correcting the final results may also be used in
cases where differential net effects have been demonstrated. One such
method is the elimination of the data collected by some of the
interviewers on the basis of certain assumptions about which interviewers
are biasing the data. Ferber and Wales describe a procedure of this
kind used in a 1950 study of attitudes to pre-fabricated housing in the
Champaign-Urbana, Illinois, area. The fourteen interviewers were
required to fill out the questionnaire themselves before the survey.
Respondents' replies were classified according to the answer of the
interviewer and Chi-squared values were computed to determine whether
interviewers obtained significantly more replies in line with their own
opinions, using a 5 per cent level of significance. On four of the eight
questions, over-all biases were indicated by the tests. To determine bias
for individual interviewers the distribution of the replies turned in by
each interviewer was compared with the corresponding distribution for
the total sample, excluding the replies of that interviewer. If the
distributions differed significantly, the interviewer's returns were taken
to be biased. Final results for these questions were then corrected by
eliminating the data obtained by the “biased” interviewers.

This procedure, of course, involves the dubious assumption that the
distribution of replies obtained by the other interviewers is unbiased.
Furthermore, as we have pointed out before, significance tests often
indicate bias where none really exists, so that unrestricted application
of this procedure is not recommended.
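
The comparison of each interviewer's returns with those of all other interviewers can be sketched as follows. This is not the computation of Ferber and Wales themselves, only an illustration of the kind of test described: the reply counts are invented, the function names are hypothetical, and the 5 per cent critical values of Chi-squared are the standard tabled ones.

```python
# Tabled 5 per cent critical values of Chi-squared by degrees of freedom.
CHI2_CRIT_5PCT = {1: 3.841, 2: 5.991, 3: 7.815}

def chi_square_two_rows(row_a, row_b):
    """Pearson Chi-squared statistic for a table with two rows of counts."""
    total_a, total_b = sum(row_a), sum(row_b)
    grand = total_a + total_b
    stat = 0.0
    for j in range(len(row_a)):
        col_total = row_a[j] + row_b[j]
        for observed, row_total in ((row_a[j], total_a), (row_b[j], total_b)):
            expected = row_total * col_total / grand
            stat += (observed - expected) ** 2 / expected
    return stat

def flag_interviewers(counts):
    """Return interviewers whose replies differ from all others at the 5 per cent level.

    counts: dict mapping interviewer -> list of reply counts per answer category.
    """
    flagged = []
    for name, own in counts.items():
        # Pool the replies of every other interviewer, category by category.
        rest = [
            sum(c[j] for other, c in counts.items() if other != name)
            for j in range(len(own))
        ]
        if chi_square_two_rows(own, rest) > CHI2_CRIT_5PCT[len(own) - 1]:
            flagged.append(name)
    return flagged

# Hypothetical counts of "yes" and "no" replies for four interviewers.
counts = {"A": [30, 20], "B": [31, 19], "C": [29, 21], "D": [45, 5]}
print(flag_interviewers(counts))
```

As the text cautions, a flagged interviewer is shown only to differ from the rest, not to be biased, and with many interviewers some will be flagged by chance alone.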


Estimates of Error Based on Experience or Independent Information

Sometimes the effects of interviewer error and bias on final estimates
may be removed or reduced by adjustment or qualification of the
estimates on the basis of experience or of independent information. The
only instances of such procedures that we can cite from the past
literature involve adjustments for the total system of errors, i.e., sampling
and response error in addition to interviewer error. However, in theory,
such adjustments could be derived purely for that component of error
due to interviewer effects. As an example of an adjustment for the total
system of errors, we may mention the age-sex adjustments of labor
force estimates by the Census Bureau. Results of the monthly labor
force surveys of the Census Bureau are adjusted by inflating the
estimates of labor force characteristics for each age-sex group to
independent current estimates of the total number of persons in that age-sex
group derived by actuarial methods.


Public opinion and market survey agencies are aware of the tendency in
quota sampling to obtain too few respondents in the lower educational
status and lower socio-economic categories, and have sometimes
corrected the results by re-weighting the data for the various economic
or educational groupings, in accordance with independent information
[...] educational or economic distribution of the population. This
procedure was used by Gallup in the 1948 election polls. [...] showed
[...], 46.8 and 35.4 per cent of the respondents in the college,
high-school and grammar-school educational groups [...] the census
educational distribution of the population age twenty-one and over
showed 13, 42 and 45 per cent in the three groups. Of the
college-educated respondents 61.6 per cent indicated [...] vote for Dewey,
compared with 43.1 per cent for the high-school group [...] 42.1 as the
per cent for the entire sample. Multiplying the percentage intending to
vote [...] group by the corresponding Census percentages for the
population [...] gave a revised estimate of 40 per cent [...] sample.
Another example of the [...] by Gallup was the correction of estimates
of [...] to allow for the inflation known to occur in the [...] who
[...] they voted in 1944 and in the number who say [...]. The estimated
inflation is determined by [...] of past surveys, and the inflation
factors [...] then [...] actual distribution of the major-party vote in
1944 [...] weights to be applied to the groups set up on the basis of
1948 voting intentions.

Another example of such an approach is available in the quality check
of the 1950 Census. We alluded to the quality check earlier, but there
its relevance was to measuring gross and net effects,
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rather than as procedure for reduction of error by empirical adjust- 
ments. An estimate of error is obtained by comparing the results of the 
original enumeration with the results obtained by intensive re-interview- 
ing of a sample of the original households on selected items, using 
specially trained, superior enumerators. Naturally, the superiority of 
the check interview is assumed to yield the more accurate results. On 
the basis of these check figures, qualifications of the findings can be 
published, and the original results can even be corrected.” 
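The re-weighting by educational group described above amounts to what is now usually called post-stratification: each group's percentage is weighted by that group's share of the population rather than by its share of the sample. The following is a minimal Python sketch of that arithmetic only, not a reconstruction of Gallup's actual procedure. Two figures are illegible in this copy of the text, so the college share of the sample (17.9) and the grammar-school percentage for Dewey (31.0) are assumed values, chosen because they reproduce the reported 42.1 per cent for the entire sample. The names `sample_share`, `census_share`, `dewey_pct` and `weighted_estimate` are illustrative.

```python
# Per cent of respondents (sample) and of the adult population (Census)
# in each educational group. The college sample share 17.9 is ASSUMED:
# only the trailing digit is legible in the source.
sample_share = {"college": 17.9, "high_school": 46.8, "grammar_school": 35.4}
census_share = {"college": 13.0, "high_school": 42.0, "grammar_school": 45.0}

# Per cent intending to vote for Dewey within each group.
# The grammar-school figure 31.0 is ASSUMED (illegible in the source).
dewey_pct = {"college": 61.6, "high_school": 43.1, "grammar_school": 31.0}


def weighted_estimate(group_pct, group_share):
    """Average the group percentages, weighting each by its share.

    Shares are normalized by their total, so they need not sum to 100.
    """
    total = sum(group_share.values())
    return sum(group_pct[g] * group_share[g] for g in group_pct) / total


unweighted = weighted_estimate(dewey_pct, sample_share)   # sample as drawn
reweighted = weighted_estimate(dewey_pct, census_share)   # Census weights

print(f"sample-weighted: {unweighted:.1f}  Census-weighted: {reweighted:.1f}")
```

Under these assumptions the sample-weighted figure comes to about 42.1 and the Census-weighted figure to about 40.1, in line with the revised estimate of roughly 40 per cent reported in the text. The correction lowers the Dewey figure because the college group, which favored Dewey most, was over-represented in the sample.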


Use of Scale Scores To Minimize Bias 


The recent development of question scales for the measurement of
attitudes, for example, the Guttman scales, may prove to be useful in
minimizing the effects of interviewer bias under certain conditions. If
the bias is not systematic in character, that is, is not manifested uni-
formly by the interviewer, but tends to occur randomly or is situational
in character, then we might expect that the employment of indices or
scale scores from batteries of questions would tend to attenuate the
effects of bias, since the random bias occurring on one question might
be lessened by “burying” this question in with a battery of others. In
this case, the use of the scale scores tends to average out the biases, in-
suring against the risk from reliance on a single question.

There are, of course, instances where the bias is of such a systematic
character that a scaling procedure would simply compound or aggra-
vate the effect. Such would seem to be the case in instances involving
attitude-structure expectations as exemplified in Chapter III. However,
there was also considerable evidence presented in Section 1 of Chapter V 
that bias varies with situational factors, and in Chapter VI that bias may 
simply be random in character. For such instances, scaling would be 
recommended. 
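The distinction drawn above between random and systematic bias can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not in the text: each interviewer's bias on a question is a normally distributed shift, a scale score is the plain average of its items, and the function name `spread_of_interviewer_effects` and its parameters are hypothetical. With independent (random) bias on each item, the spread of interviewer effects on a ten-item score shrinks to roughly one over the square root of ten of the single-question spread; with a uniform (systematic) bias it does not shrink at all.

```python
import random
import statistics


def spread_of_interviewer_effects(n_items, n_interviewers=2000, sd=1.0,
                                  systematic=False, seed=0):
    """Std. dev. across interviewers of the bias in a score averaged
    over n_items questions (all values simulated, not survey data)."""
    rng = random.Random(seed)
    effects = []
    for _ in range(n_interviewers):
        if systematic:
            # The same bias on every item: averaging cannot cancel it.
            shift = rng.gauss(0.0, sd)
            item_biases = [shift] * n_items
        else:
            # An independent bias on each item: the errors partly cancel.
            item_biases = [rng.gauss(0.0, sd) for _ in range(n_items)]
        effects.append(sum(item_biases) / n_items)
    return statistics.pstdev(effects)


single_question = spread_of_interviewer_effects(1)
random_scale = spread_of_interviewer_effects(10)
systematic_scale = spread_of_interviewer_effects(10, systematic=True)

print(f"single question:            {single_question:.2f}")
print(f"10-item scale, random bias: {random_scale:.2f}")
print(f"10-item scale, systematic:  {systematic_scale:.2f}")
```

The simulation does not model the case, mentioned in the text, where scaling compounds a systematic bias; it shows only that averaging leaves such a bias undiminished.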

An actual example of the value of using scales or batteries as at-
tenuators of bias can be constructed from some data of the Denver
Community Survey not previously presented in Chapter VI. One
omnibus question contained ten subparts asking about the respondent's
degree of interest in various public problems. Three of the items
represented logical components of a battery or scale of interest in
local affairs. These dealt with city planning, the public school system,
and [...] of the city administration. On the first two of [...]
were highly [...] in the results obtained from equivalent [...]
that the results per item [...]
would have been reduced, however, [...]
pooled into a common scale of interest in local
affairs, since the deviant results for given interviewers were not con-
sistent over the three items. This can be indicated by ranking the nine
interviewers in each sector on the degree of interest their samples
manifested in each item and intercorrelating the ranks over questions,
sector by sector. The fifteen co-efficients ranged from -.13 to .80 with
a median value of .33. Since the expected value of these co-efficients,
even if there were no interviewer effect, would be of some positive
magnitude because of the sheer generality of interest among human
beings, the median value of .33 is all the more compelling in arguing
that the ranking of respondents by the use of a scale would be less
affected by the interviewers’ own bias than ranking by individual
questions.
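The ranking and intercorrelation described above can be sketched as a rank correlation computed for each pair of items within a sector. The sketch below uses Spearman's coefficient, which is an assumption: the text says only that ranks were intercorrelated. The interviewer percentages are invented for illustration and are not Denver Community Survey data, and the names `ranks`, `spearman` and `sector` are hypothetical. Three items give three pairs per sector, so fifteen coefficients would correspond to five sectors, which is an inference rather than something stated in this passage.

```python
from itertools import combinations
from statistics import median


def ranks(values):
    """Rank 1 = highest value. Assumes no ties (a simplification)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation for untied data."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# HYPOTHETICAL data for one sector: per cent of each of nine interviewers'
# respondents reported as interested in each item.
sector = {
    "city_planning":  [52, 47, 61, 39, 55, 44, 58, 50, 42],
    "schools":        [63, 59, 60, 66, 54, 70, 57, 62, 65],
    "administration": [41, 38, 45, 33, 47, 36, 40, 44, 35],
}

# One coefficient per pair of items: three pairs for three items.
coefficients = [spearman(sector[a], sector[b])
                for a, b in combinations(sector, 2)]
print([round(c, 2) for c in coefficients], "median:", round(median(coefficients), 2))
```

Low coefficients mean that an interviewer whose sample ranks high on one item does not reliably rank high on the others, which is the condition under which pooling the items attenuates the interviewer effect.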

Additional evidence of the attenuation of effects through the use of
scores based on the pooled answers to a battery of questions was avail-
able in Chapter VI, in the finding that there was no difference in the
relative reliability of scores on [...] indices when respondents were
re-interviewed by the same vs. different interviewers. This was in con-
trast with the finding that answers to single questions were affected
systematically by the particular interviewer used.

Besides the possible use of the scale scores for attenuation of bias,
they might also provide a better measurement, or test, of whether bias
is present. Chapter VI offered a good argument for the belief that
many of the findings of interviewer bias may represent simply chance
fluctuations. Thus the erratic character of results when testing for bias
on individual questions could be decreased by the employment of the
more stable scores for a whole scale of questions.


APPENDIX A 
Procedural and Methodological Data Bearing on the
Qualitative Materials for Chapter II, the Definition 
of the Interview Situation 


The purpose of this appendix is to describe the procedures by
which the phenomenological reports drawn upon in Chapter II were
collected. In so far as readers are impressed with the value of a
phenomenological description of the interview for future research into
interviewer effect, this appendix might serve as a guide to others who
would collect new data to add to the fragmentary picture we now
have. In addition, the reader can assess the quality of the original find-
ings in the light of the procedure and specific evidence to be presented
on the problem of validity.

Admittedly, the procedures necessary to obtain the type of data we
were seeking will never satisfy the positivistically minded reader in the
way that experimental and statistical data would. But experimental and
statistical data would never have been adequate to our purpose. We
sought the subjective view of the interview situation, and this called
for subjective data which for some readers, unfortunately, has the
connotation of unreliability. For such readers, nothing would buttress
their faith in the data. But in relation to such categorical criticism, it
should be pointed out with clarity and emphasis that the use we made
of such data was tentative. Generalizations, in so far as they were ad-
vanced, were qualified. The data were the basis essentially for specula-
tion and theorizing; the verification of such theories involved other
more orthodox procedures of a statistical and experimental sort. The
support for these suggestive findings in Chapter II, therefore, rests
ultimately on the entire body of evidence in this project and not
merely on the evidence of the quality of the procedures here reported.

Three procedures were relied upon for reconstructing the definition
of the situation: First, intensive interviews with interviewers to obtain a
picture of the totality of their experiences. Secondly, a reconstruction
of a series of particular single interviews through reports from both
parties. Third, accounts of the interviewer’s experiences while listening
to a transcription representing a recorded interview. Each of these
will be discussed in turn.
1. THE INTENSIVE INTERVIEWS WITH INTERVIEWERS


Sampling Considerations 


Seven such interviews were conducted. All seven of the interviewers
were long experienced, professional survey interviewers. Five of them
were women. Five of them had had their main experience in the New
York Metropolitan District, one had worked in the Middle West, and
the other had worked in every conceivable area. All but one were on
the staff of NORC (the non-NORC interviewer had had longest
experience with intensive interview surveys for government agencies),
but five of them had worked for a variety of agencies doing field work
of all types. It is obvious that they constituted no representative sam-
ple of survey interviewers. But this is no serious criticism. The inter-
views were deliberately restricted to interviewers who would have the
greatest fund of experience as a basis for communicating a richness of
material to us. Further, the interviewers were deliberately selected in
terms of ability to reminisce, to introspect, to [...]
and to report it to us in detailed terms. If one [...] is perfectly
possible. But out of such revelations might come the stimulation for a
theory, which would be regarded as provisional until verified in pre-
cise ways and found to have generality.


The Procedure Followed and the Validity of Reports 
[...] to start by telling us what he felt was [...]
about. After this, or in occasional instances where there were
no spontaneous remarks, the suggestion was given [...]
some of his experiences by thinking back to [...]
upon his recollection of his feelings. [...] spontaneous reports
were interrupted periodically by probes to clarify [...].
Generally the [...] with exceedingly little structuring,
and the order and content of remarks were determined naturally. The
answers were recorded verbatim by manual procedures.

No standardized questionnaire was used. While the attempt was
made to cover particular areas of experience, wherever possible no
questions on the particular area were mentioned until late in the inter-
view so that much of the material was liberated spontaneously. It is
unlikely, therefore, that the phenomena reported are in any consider-
able degree artifacts of direct questioning. The particular areas which
we attempted to cover included: the gratifications they derived from
interviewing, the interviewers’ reactions to the respondents’ attitudes
and to the treatment respondents accord them, their beliefs about the
existence of certain attitude patterns within the respondent and in
different groups of respondents, the role interviewers feel it is de-
sirable for them to assume, their attitude toward probing and experi-
ences in probing; the reaction of the respondent to the approach, the
questions, the interviewer's personal characteristics, and to certain in-
terviewing circumstances; and finally in the sequence some direct ques-
tioning about bias. Naturally, in such a lengthy interview, with a
minimum of structuring, and with the respondents themselves being
interviewers, there was a very discursive quality to the reports, and
many other areas of experience were brought into discussion.

A number of questions immediately arise with respect to the quality 
of the reports given: 

Bias due to the interviewer-subject wanting to present an account to
his employer that would insure or enhance his security.—Since the
interviewer who conducted these interviews was known as a perma-
nent member of the NORC staff, it might well be that an interviewer-
subject would deliberately conceal certain kinds of experiences and
behavior out of fear that such revelations might be a basis for dis-
charge. With respect to this possible error, it might be pointed out
that there is no proper norm for interviewer conduct in most of the
areas discussed to which the interviewer-subjects could orient them-
selves. Explicit admissions of interviewer bias constitute the only
violation of known norms, and this was incidental to the main con-
tents of the interview. In addition, the general atmosphere of the inter-
view was exceedingly permissive, and the subjects with one exception
were on exceedingly good and friendly terms with the interviewer.
Finally, it may be noted that in the very place where concealment
[...] to occur, in reports of flagrant bias or [...],
there were explicit reports by two of [...] subjects of [...]
behavior. We are quoting these in order to convey
the lack [...] interviewer-subject in the situation.

[...] spontaneously: “I'm afraid I often reword the questions.
[...] printed. But then when they look blank—suppose
[...] ‘how do you feel about another war’—maybe they
don't say [...]. So then you say ‘well, when you think of bombs
falling and [...] your husband going to war,’ well, then, as one
woman replied [...], she said: ‘I wake up every morning being scared stiff
of a [...].’”

[...] In describing how he conducted an inter-
view with a [...] respondent, he reports: “I went ahead and inter-
viewed her. When she didn't understand a word, I would have her son
explain it to her [...] and with simple words and pantomime I would make
clear to him [...] what was meant by the words in the question. . . .
I realize I [...] in arrogating to myself the authority to make
[...]”

Later [...] that he may reword the questions, he remarks: “I
must [...] to a shortcoming. I do not believe that I sufficiently do as
the good book suggests. . . . I confess that originality is probably
indefensible, but it is a freedom I take upon myself because I am quite
sure, in my own mind, that I have sufficient understanding of words and
[...] the question differently with-[...]”


[...]ting, and to selectivity.—[...] were used to obtain a pic-
ture [...] of their experience, and not [...].
Consequently, the problem of [...] through the eyes of the inter-
viewer [...] a report of the objective facts. The intrusion of
[...] in relation to most [...]
was exactly what was [...] of these interviews. But
[...] these protocols as detailed reports of [...]
have considerable validity. They are exceedingly [...]
gross, [...] pictures as would be the case in the recall of distant and
forgotten events. In addition to detail, the experiences were elaborated
at great length. Four of these accounts ran approximately 2500 words in
length; two of them ran 7000 words, and, as mentioned in the text, the
interview with [...] was of such detail that it exceeded 17,000 words in
length. The material is full of such detailed recollections as “I had an
experience in Williamsburg once”; “On Survey 152, the women
[...]”; “On one survey I was
in a C-D neighborhood—I was speaking to a woman—another woman
overheard it and burst forth and said ‘stop talking, she’s a Communist.’”
The descriptions seem to be fluent accounts of experience, reported
with great ease.

Faulty inferences and analyses made on the basis of examining the
protocols.—The treatment of the data in these interviews was not
statistical. Data were not coded or tabulated in any uniform way. The
material was simply examined and inferences drawn about certain
phenomena. Since no claim is made for the frequency of these phe-
nomena among the seven interviewers or among interviewers in gen-
eral, it was felt that statistical treatment was not essential. These reports
are presented as case material from discrete interviewers. The inferences
may at times be faulty, but the original data are presented in detail in
the text, so that the reader can easily judge for himself. The original
interviews are, of course, on file at the National Opinion Research Cen-
ter and can be examined for a check on the present analysis.


2. THE CASE STUDIES OF PARTICULAR INTERVIEW SITUATIONS 


Sampling Considerations 


The three case studies presented in the text are part of a larger series
of descriptions of particular interview situations. The series was based
on phenomenological data covering the mutual experiences of both
respondent and interviewer in fifty actual survey interviews. These
particular interviews were conducted in the course of only one national
survey on political issues at a particular moment in time. The fifty sub-
jects were selected from those who had been interviewed in the three
sample points, New York City, Chicago, and Denver, by a total of ten
interviewers and further restricted by the fact that only certain re-
spondents were co-operative enough to submit to the procedure to be
reported below. The reader might well raise certain questions about
the sampling. The interviews are not many in number and are based
on the work of only a few interviewers, interviewing only respondents
in big cities. These interviews may also be biased with respect to the
sampling of conditions of the interview in that they refer only to
situations where political contents were collected at a certain historical
moment. Moreover, they are obviously biased in that some respondents
[...] qualify for inclusion in the group we initially planned
[...] co-operate or were not available, and in that [...]
from the larger series for presentation in [...]
text. Criticisms [...] such grounds of sampling do not seem crucial. The
material, like [...] interview data, is not presented as a basis for
final [...]
[...]fied for the re-interview. His name was not recorded on the original
interview, partly because of established practices about anonymity and
partly so as not to warn the original interviewer of the likelihood of
the respondent’s being revisited. In the re-interview, the respondent
was informed that we believed he had been interviewed on one of our
recent surveys, and that we would like to know his reactions so as to
improve our general field procedures. The re-interview was [...]
by asking him if he remembered the original interview. This provided
an opportunity to study the impact of the total experience and his
orientation to specific features of it. Later questions dealt with his
feelings about being interviewed, his motivation for accepting the
interview, his reaction to the experience, the way he conceived of the
situation (e.g., like a quiz, an argument, a friendly conversation, etc.),
his reaction to the interviewer as a person, and his report of the inter-
viewer's behavior, particularly with respect to the communication of
bias. In general, there was a deliberate parallelism in the coverage of
the original interviewer's report and in the respondent’s re-interview
so as to obtain the mutual views and appraisals of the same aspects of
the interview situation. These procedures presumably yielded data on
the undercurrent of the interview situation. Of course, we also had the
actual record of the respondent’s answers to the questions in the origi-
nal interview, and we had also obtained the original interviewer's own
attitudes by having him complete the regular questionnaire for the
survey. Data were thus provided for evaluating the disparities that
existed between the two parties in their ideology and group member-
ship, and the measured attitudes revealed in the original interview
could be examined in the light of the interview setting in which they
had been elicited. Many questions about the validity of these reports
arise:

Biased reports of the original interviewer's experiences in order to
protect his own employment.—While the original interviewer was
instructed that our purpose in having him complete the questionnaire
about the situation was purely for improvement of general procedures,
and although he seemed not to sense that a re-interview would occur,
he may well have felt that this was a method of surveillance over his
performance. Consequently, he may have presented distorted [...]
in a better light. This factor would have operated [...]
associated with flagrant biases on his [...]
incidental to our purposes. This source of [...]
to reduce reports of unfavorable reactions [...],
and reports that the interviewer himself reacted in
hostile fashion [...] it certainly did not eradicate the latter [...]
clear [...] in the text of “The Creep.” [...]

[...] interview and re-interview.—[...]
made to [...] re-interview as soon after the [...]
as possible. However, because of the difficulty of finding
respondents [...] for a re-interview, several days
intervened. The time between original and re-interview ranged [...]
two to eight days [...] a median figure of five days. For purposes of
studying the [...] of respondents, this was not ideal, but
in terms of [...] the impact of the experience, it yielded the finding
that the [...] was soon dissipated. With respect to losses due to
forgetting, [...] our impression that any lack of detail was not due to
the time lag. It [...] to be more a function of the particular respond-
ent [...] the experience. Those who were detached, for
[...] who were lost in their private worlds,
were the ones [...] remember. Table 88 presents some quanti-
tative evidence of this possible error factor by showing [...]
[...] such attempts, the impression was that many respondents were suspi-
cious of our motives and did not want to jeopardize the employment
of the original interviewer. There seems therefore to be a [...]
of reports of feelings of hostility to the original interviewer, [...]
reports that the interviewers engaged in practices which [...]
might sense were against the rules. That such reports by respondents
about unsatisfactory behavior on the part of interviewers, or about
unsatisfactory reactions to the interviewer, were not completely sup-
pressed is clear from the two case histories presented in the text—
“Hen Party” and “The Tough Guy.” Nevertheless, there seems to be
no general control over this source of error, and certain findings must
be qualified in the light of its operation on the respondent.

Inability of the respondent to separate his reaction to the re-interview
itself from his report of feelings in the original interview.—Just as the
original interviewer created a certain atmosphere and effect, so
must the re-interviewer. Perhaps the reactions to the new [...] in
some way have contaminated the memory of the original event. There
seems to be some suggestion from reading the re-interviews that this
did happen.

In general, the re-interviewer was a somewhat more skilled indi-
vidual, so we may assume that the atmosphere he created was [...]
better rapport, perhaps greater social interaction, less hostility, [...]
disparity between respondent and interviewer, and less tension. In
occasional instances, the respondent did react with greater hostility to
the re-interviewer. In all these instances, however, the effect of [...]
biasing factor would be to distort the respondent's statement of the
original situation in a predictable direction.

In reconstructing the case histories of these situations, the analyst
reported wherever he sensed the operation of such a factor and the
material reported in the text was evaluated in that light. Nevertheless,
such a source of error may still be operative.

Method of integrating materials independently derived from inter-
viewers and respondents.—The foregoing discussion covers the [...]
types of response errors that might have affected our reconstruction of
the interview situation. However, another major possibility of error
arises during the analytic phase of the work. The mutual reports of
interviewer and respondent were only the raw data for the phenomeno-
logical description of the concrete interview situation. The [...]
descriptions were derived by an analyst who immersed himself in the
four lengthy sets of materials—interviewer's description, respondent's
description, interviewer's expressed attitudes, and respondent's ex-
pressed attitudes—and [...] wrote a reconstruction of the original
situation. [...] purposes the data were tabulated and cross-
tabulated. But this [...] processing was found inadequate to the
[...] of the [...]
[...] on the analyst's sensitivity in integrating these
materials into a coherent picture. At first, no guiding scheme was given
the analyst, and he simply read each case separately for suggestions of
the special [...] involved. After much reading of the materials, a
schema [...] developed for the description of the situation, and the
final cases, such [...] reported in the text, were analyzed under these
headings [...] the descriptive report of the situation written.

[...] a procedure that there is much opportunity for
the analyst to [...] his bias in the interpretation or simply to mis-
interpret the [...]. The check upon this was to have a second analyst
read the identical materials and examine the interpretations given by
the first analyst and confer with him. All the materials presented are
[...] at least the combined judgments of two analysts, and thus
[...] interpretations.

[...] by hidden mechanical recordings of the event or by
hidden observers’ ratings of the event. But such procedures, while
reliable, would have given a picture of only the externals of the inter-
view. The inner world of the interview would have been inevitably [...]

ence in surveys. The subjects were chosen deliberately because of the
belief that they were sensitive individuals who could co-operate and 
would be capable of analyzing the flow of their experience and making 
it articulate to us. It is obvious that the two are not in any sense a 
sample of interviewers, but again it should be stressed that their re- 
ported experiences are not the basis for firm generalizations. 


The Procedure Followed and the Validity of the Reports 


The background of the procedure used in this study is reported in
detail in Chapter III. Each interviewer-subject listened to two transcrip-
tions which presumably were obtained during actual interviews. Each
was instructed to record the answer of the respondent on a copy of the
questionnaire which corresponded to the questions used on the tran-
scription. These interviews had been produced artificially by a profes-
sional radio actor of long experience acting as respondent and reading
a set of prepared answers to an interviewer who questioned him. They
were specifically designed for an experiment on attitude-structure ex-
pectations and consequently, with the exception of occasional am-
biguous or contradictory answers, they conveyed pictures of two
contrasted types of respondents, each with a unified pattern of atti-
tudes. The subject was instructed to report whatever came to his mind
in the process of listening to the interview and recording the respond-
ent’s answers. Whenever necessary, the transcription was interrupted
for as long as the subject cared to talk, and if he wished, a portion of
the interview was replayed for him. Such playbacks tended to destroy
some of the unity of the original interview and to give it a fragmentary
character.

[...] prepared list of questions was asked. Periodically, remarks made
by [...]
points in the transcription that were regarded as crucial moments [...]
necessary to inquire whether the interviewer-subject had anything to
say. But for the most [...]
judgment were spontaneously reported. While the purpose of the
procedure was [...]
[...] included details as inconsequential for our purposes
as [...] to the “hmm” sound made by the original interviewer
at [...] point. Consequently, we can assume that the phenomenon [...]

[...] experiences in the real interview. There were
a number of ways in which the artificiality of the situation might
jeopardize the results, and these will be discussed in turn.

[...] the transcriptions were simulated interviews and
consequent artificiality in the report.—Neither of the interviewer-sub-
jects reported [...] suspicion of the transcriptions. In their accounts,
there is [...] reference to the supposed interviewer and respondent
and much attribution to them of various characteristics. Neither subject
recognized that the respondent was in actuality the same person on
both transcriptions, and one subject contrasted one of the “inter-
viewers” with the other, despite the fact that they were in actuality [...]

Factors in the situation accentuating the formation of an impression
of the respondent.—Two factors might have worked in this direction.
Since we wished to highlight the dynamics of such cognitive processes,
it was felt necessary to magnify the pictures presented. Consequently,
the two simulated interviews deliberately portrayed contrasted types
of respondents and the characterizations were somewhat extreme. Since
some respondents met in real life would have less integrated ideologies,
these transcriptions might convey an exaggerated picture of the opera-
tion of attitude-structure expectations.

Granted that this is true, it does not jeopardize the inferences drawn 
in Chapter II. Conclusions are not drawn that such expectations always 
occur, or frequently occur. The phenomenological data were intended 
to demonstrate that they did occur, and something of their dynamics, 
and there is assuredly in real life a certain number of respondents of the 
type pictured on the transcriptions. 

In addition, such criticism is predicated on the assumption of the
rarity of these types of respondents in the normal opinion survey, but
the reality and frequent occurrence of such extreme types is well
known to all in public opinion research. The fact that many of the
opinions in the transcriptions were taken from answers actually ob-
tained in past surveys supports this point. Moreover, data presented in
the original published account of the study show that the characteriza-
tions were not always regarded as extreme, so that this error may not be
as serious as would at first appear. Whether this bias is completely
compensated for in magnitude by the factors previously mentioned
which minimize the formation of impressions is not known, but of
necessity the total error must be reduced to some extent.

companies. And they are perhaps more highly motivated in other
respects because they tend to dislike consumer and market studies and 
to take particular interest in the types of surveys conducted by NORC. 

NORC performs no market or consumer research, and all its surveys
are financed by means of foundation grants or by such clients as gov-
ernment agencies, universities, or private institutions of an educational,
charitable or scientific nature. The questions that NORC interviewers
ask, therefore, generally concern social, economic, or political issues.
Methodologically, however, the type of question and format of the
questionnaire do not differ materially from those employed by any
other market or opinion research agency, and essentially the same in-
terviewing rules are followed. All interviews are conducted face-to-
face, with the interviewer reading the questions and then recording on
the questionnaire the respondent's answer—either by reporting his
language verbatim or by checking or circling the appropriate pre-code.
Sometimes all of the questions concern a single broad issue or subject;
sometimes they take up a variety of topics which may not be closely
related. At the conclusion of each interview, a series of factual ques-
tions such as age, education, occupation, etc., are asked of the respond-
ent. Though the majority of the questions are pre-coded in form and
offer the respondent his choice of two or more suggested responses,
there are frequent subquestions of the “Why do you feel that way?”
type, and occasionally there will be other open-ended questions invit-
ing a free-answer response. Some of the questions are factual in nature
(e.g., “What newspaper do you read?”), but most solicit the person's
opinion. Interviewers are encouraged to avoid “No opinion” or “Don’t
know” responses, and to urge the respondent to consider the question,
to answer it “Just in general” or “Taking everything into considera-
tion,” and to select the one alternative that comes closest to his own
opinion or impression. Many of the pre-coded questions are of the
dichotomous type, but others are in the form of a scale, and some
occasionally require the use of a card on which three or more some-
what lengthy statements or alternatives are presented for the respond-
ent’s choice.

All of the NORC interviewers have been hired in person; none was
employed by mail. The hiring agent was in most cases one of the
salaried field supervisors in either the Chicago or New York office,
although about one-fourth of the interviewers were hired by a
“regional supervisor,” another NORC part-time interviewer, but one
with several years of NORC experience, who has been entrusted with
supervisory duties in the general geographical area in which she
resides. About one-third of the staff were hired as a result of their
independent inquiry and application; they wrote in or appeared at the
office seeking employment as interviewers, and when openings occurred
on the staff, they were hired. The remaining two-thirds were sought out
by the NORC representative, most usually through inquiries from local
officials or the heads of community organizations. Except in the few
cities where NORC maintains offices or regional supervisors, all hiring
was accomplished on “field trips” in which the NORC representative
would visit the town or city where new interviewers were required. In
such cases, approximately fifteen or twenty applicants are usually
screened for every three or four that are hired.

All of the NORC interviewers have received training in NORC 
techniques and procedures under the personal direction of an office or 
regional supervisor, and except when large numbers of interviewers are 
being trained for a special study in a particular locality, the training is 
always given individually. The amount of time spent on this training 
has varied from a single afternoon to several days, depending upon the 
applicant's aptitude and experience and the amount of time available. 
In general, the procedure is as follows: After studying certain basic
instructions and preliminary materials and after a short talk by the
supervisor, the applicant obtains, by himself, two or three trial
interviews on the NORC training questionnaire, the first with a friend
or relative, the last with a stranger. These interviews are
subsequently criticized by the supervisor, with appropriate comments
upon any obvious errors or weaknesses. The applicant then interviews
the supervisor, who gives prepared answers of a difficult or problem
type and who acts, in general, the part of a difficult respondent in
order to test the applicant's ability to handle a variety of
situations. Following this interview and discussion, the applicant is
taken into the field and directed to obtain two or three interviews
with strangers of varying socio-economic levels in the presence of the
supervisor, who notes any particular errors or weaknesses and later
comments upon them. The supervisor himself may often give one or more
demonstration interviews as an example. A final discussion between the
two, in which any remaining problems or difficulties are taken up, ends
the training.

Once hired and trained by the supervisor, the new interviewer, unless
he lives near by or resides in a city frequently visited by NORC
personnel, usually is completely without personal contact with the
office. He may at long intervals be visited by a traveling supervisor,
but most members of the national staff have had only mail contact with
the office since they were first hired. This appears to be a common
situation among nationwide interviewing staffs, although some agencies
have been able to meet the problem better than others through periodic
regional conferences or through the employment of a full-time traveling
supervisor who can in the course of time visit almost all of them.

In general, there is no personal supervision of the NORC inter- 
viewer's actual work. Unless he should be a new interviewer in a large 
city, working under the direct supervision of an office or regional 
supervisor, he receives his assignments by mail, directly from the 
office, and after completing his interviews, returns the material by 
mail, directly to the office. He works alone, from written instructions, 
and the results of his work are entirely dependent upon his own skills, 
initiative, and understanding of the NORC directions. The names of 
his respondents are not recorded, and unless his interviews reveal some 
suspicious pattern or otherwise lead the office to suspect fabrication, 
there is no direct check on the validity of his calls. 

To offset the lack of personal contact and supervision once the
interviewer is enrolled on the staff, NORC has instituted a variety of
quality controls and morale-building devices. Each interviewer, for
example, receives at the time of his enrollment on the staff a
hard-cover copy of the field manual, “Interviewing for NORC.” This
manual, published in 1945 and revised slightly in 1947, is the
interviewer's “blue book.” Its 150 pages cover every aspect of his
work, and he is held responsible for a thorough mastery of its
contents. A 100-item “True-or-False” test has been prepared to test
interviewers’ familiarity with the manual, and while the interviewer is
free to look up any doubtful answers, mere reference to the manual for
the correct response achieves one of the purposes of the test. In
addition to the basic manual, detailed specifications, or specific
instructions for that particular survey, accompany each interviewing
assignment. These specifications, which usually include six or eight
single-spaced mimeographed pages to cover a twenty-question
questionnaire, tell something of the background of the survey and its
purposes, contain general advice and suggestions on how to handle
particular problems which may arise, and discuss each of the separate
questions in detail. The specifications are written on the basis of the
office’s pretesting experience, and they carefully instruct
interviewers on the proper handling of particular types of vague,
qualified, or irrelevant responses which may occur. The precise meaning
and objective of each question are elaborated for the interviewer's
benefit, and occasionally specific alternative phrases are authorized,
in the event that certain respondents do not understand the question as
it is worded.

Every interviewer knows that his interviews receive a rigorous
examination and analysis in the office, and that his work is “rated”
from the standpoint of quality. In actual practice, not every
interviewer is rated on every survey; but all new interviewers and all
“borderline” interviewers have each of their assignments rated, and
even veteran members of the staff are rated on every alternate
assignment. These office ratings cover the interviewer's handling of
free-answer questions; the degree and manner in which he probed replies
which were not clear, relevant, or specific; the number and type of
comments he elicited on pre-coded questions; the degree to which he
seems to be reporting completely and verbatim; the care with which he
studied the instructions and filled out the questionnaire; the number
of checking errors or omissions he made; the clarity and completeness
with which he described such characteristics as “Occupation”; and his
sampling performance, which on probability surveys would include his
following of instructions and the care and accuracy with which he
filled out his forms, and in quota sampling would include his accurate
filling of the assigned quotas and the representativeness of his cross
section in terms of such unassigned characteristics as geographical
location, education, occupation, etc. These ratings are recorded in
detail on “rating sheets” and are the subject of a considerable amount
of correspondence from [...]
education program, special attention always has been given to the
problem of bias. Applicants with obviously biasing [...] are never
hired, and the new interviewer is indoctrinated early in his training
with such precepts as “Never suggest an answer,” “Ask the question
exactly as worded,” “Never show surprise at a person's answer,” “Never
reveal your own opinions,” etc. The index to the NORC interviewing
manual lists no fewer than twenty-five separate references to “biasing
factors,” and entire sections of that volume are devoted to two areas
of interviewer performance in which our studies have found the greatest
evidence of bias: field ratings and probing. The specifications for
each survey further alert the interviewer to bias by noting the areas
in which it is most likely to occur, and they endeavor to standardize
just such matters as probing behavior on each question and the criteria
to be used in field ratings. Evidences of bias are also considered in
determining the NORC interviewer's performance ratings on each survey.
Marked or unusual patterns of responses, the repetition of particular
words or phrases in free-answer replies, indications that suggestive
probes have been used, deviant behavior as revealed by comments on the
interviewer's report form: such weaknesses are always noted and pointed
out to the interviewers in the letters they receive from the office.

The frequent letters from the office, in addition to their purpose of
training and education, are designed to maintain and improve the
interviewer's morale by demonstrating that his problems are understood,
that his work is appreciated and used, and that his complaints and
difficulties receive sympathetic attention. Letters containing a great
deal of criticism are so phrased as not to discourage the interviewer,
and the more skilful members of the staff often receive personal
communications which contain only praise and thanks for their good
work. Even these superior workers, however, are constantly encouraged
to think about interviewing problems and to work toward still greater
skill and efficiency. Various other devices are employed in these
letters to make the distant interviewer feel that he is an integral
part of an organization: events in his personal life which come to the
office's attention (for example, a child’s illness or a daughter’s
graduation) are acknowledged and commented on; he may be given some
[...] information about a forthcoming survey or about the uses to which
a past survey was actually put; he may be asked to supply us with
descriptive or statistical data about the community he lives in, etc.

Further to keep the isolated interviewer in touch with the office, a
monthly newsletter (usually four mimeographed pages in newspaper
layout) is mailed to each member of the staff, including those who are
temporarily inactive. This newsletter, designed to be both informative
and entertaining, contains humorous anecdotes submitted by the
interviewers, results of past surveys, suggestions on interviewing
techniques, stories about particular interviewers who have
distinguished themselves in one way or another, news about plans for
prospective surveys and the schedule for the immediate future,
occasional stories about the activities of the office staff, etc.
Inexpensive gifts are sent to each member of the staff at Christmas
time, and occasionally interviewers with very superior records or long
service receive special awards.

A further incentive to conscientious work lies in a sliding scale of
pay, based in part on the interviewer's ratings and in part on his
length of service. He starts at the minimum figure, which, after his
completion of four assignments with satisfactory ratings, is advanced
to a somewhat higher rate. On the completion of ten assignments
(usually about a year later), and provided his ratings are above
average (in the upper 40 per cent), he is raised to the highest rate.
Thus, at least until he attains the maximum rate, there is a financial
incentive for the interviewer to accept as many assignments as are
offered to him and to strive to correct any deficiencies reported to
him in letters from the office. By the time he attains the highest
rate, interest in the work and pride in his performance generally
assure his continued diligence.
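
The sliding scale described above can be restated as a small decision
rule. The following Python sketch is only an illustration under stated
assumptions: the function name `pay_tier`, the tier labels, and the
reading of "upper 40 per cent" as a percentile rank of 60 or more are
hypothetical, and the text gives no actual rates of pay.

```python
def pay_tier(assignments_completed, ratings_satisfactory, rating_percentile):
    """Return a hypothetical pay-tier label for one interviewer.

    assignments_completed: number of assignments finished so far.
    ratings_satisfactory:  True if the office ratings have been satisfactory.
    rating_percentile:     rank among the staff, 0-100 (assumed measure;
                           60 or more is taken to mean "upper 40 per cent").
    """
    tier = "minimum"
    # First step: four completed assignments with satisfactory ratings.
    if assignments_completed >= 4 and ratings_satisfactory:
        tier = "intermediate"
    # Second step: ten assignments and ratings in the upper 40 per cent.
    if (tier == "intermediate"
            and assignments_completed >= 10
            and rating_percentile >= 60):
        tier = "maximum"
    return tier


if __name__ == "__main__":
    for case in [(3, True, 90), (4, True, 50), (10, True, 60)]:
        print(case, pay_tier(*case))
```

The rule is written so that the highest tier can be reached only by way
of the intermediate one, which is one possible reading of the passage.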

NORC interviewers are paid by the hour, on a “portal-to-portal” basis,
and are reimbursed for all necessary expenses [...], phone calls,
postage, parking fees, etc. The hourly rate [...] considerable
variation in charges from one interviewer to another, as a result of
differential interviewing efficiency and of [...] type of quota
assigned, the weather, etc. But this method of payment is believed to
encourage more skilful and more [...] interviewing, since it removes
the temptation to do careless or [...] work for the sake of speed. The
interviewer is paid for all the time he spends on the job, and if he is
handicapped by bad weather or is forced to make an unusual number of
callbacks or is detained by a particularly garrulous respondent, he is
not penalized. [...] Attempts to take advantage of the hourly method of
[...] “padding” the number of hours listed as spent on the job are
[...] apparent from a routine cost analysis. Interviewers whose [...]
unusually high, when compared to other [...] comparable assignments,
are apprised of the fact, [...] increase their efficiency on future
surveys, and invited to [...] any trouble they have in this respect.
Those whose costs remain consistently much higher than average are soon
dropped from the staff unless special circumstances are involved.

The volume of interviewing handled by the average NORC interviewer is
not great. As we have noted, the great majority do not interview for
any other agency, and NORC does not demand a great deal of their time.
The typical NORC interviewer will complete about eight assignments per
year, although the number may range from four to fourteen, depending
upon his location, availability, competence, and the number of national
surveys NORC has scheduled. Not only is he called on less than once a
month, but when an assignment does come, it is usually a small one
which can easily be completed in two or three days. Assignments
generally range from twelve to twenty interviews, and the interviews
themselves usually average about a half hour with each respondent. Most
interviewers who can put in full days complete about ten interviews per
day, although many of the staff prefer to interview only part-time and
to distribute the work over the three or four days usually granted to
them for completion. Assignments are generally sent on very short
notice. An advance postal card is mailed to interviewers selected for
the survey as soon as the mailing date is known, but this card usually
arrives only three or four days in advance of the actual survey
materials. The interviewer is free to telegraph his inability to accept
the assignment, without prejudicing his position on the staff, although
frequent or consistent refusals will generally draw a letter from the
office suggesting that the interviewer be placed on the “temporarily
inactive” list until such time as he can accept a larger share of the
assignments offered him. Though interviewers always work in or near
their home area, their specific assignments are usually rotated to
avoid monotony. Thus, one assignment may call for near-by farms, the
next one may specify residents of the interviewer's own town, and the
following quota may send him to some adjacent city or county.

NORC's national surveys of the period covered in this report were
based on a form of quota sampling, restricted by the designation of
preselected blocks in most urban areas. Where such restrictions do not
occur, the interviewer has quotas in terms of sex, two age groups, and
four rental brackets, and each cell must be correctly filled. The
interviewer generally knows in what parts of the city he can find
people with homes of the assigned rental values, and within those
neighborhoods he strives to fill his sex and age quotas. At the
beginning of an assignment he can accept virtually anybody for his
sample, but he soon begins to fill the small cells, and a considerable
number of calls is usually necessary before he can fill the last few
holes in his cross section. The interviewer is not supposed to
interview his friends or relatives; he is supposed to keep in mind the
importance of such uncontrolled factors as nationality, religion,
education, etc.; and he is asked to scatter his interviews
geographically by obtaining no more than three in any block nor six in
any neighborhood. The recording of background data about each
respondent, including his address, provides a check on the degree to
which the interviewer complies with these requirements. Under the
block-sampling procedure, a city’s blocks are stratified by rent in the
NORC office, and pairs of blocks are drawn at random from each stratum.
Two sides of each of the designated blocks are then randomly specified
for the interview, so that the total assignment consists of clusters of
four interviews on two blocks. The interviewer is free to select any
dwelling unit on the assigned side of the assigned block, so long as he
stays within his sex and age quotas. Callbacks are sometimes required
when the block-side contains only a few dwelling units, and a
substituting procedure is specified when no units at all are available.
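
The block-sampling procedure just described (stratify blocks by rent,
draw a pair of blocks at random from each stratum, then randomly
designate two sides of each drawn block) can be sketched in a few lines
of Python. This is a minimal illustration, not NORC's actual office
routine: the function name `draw_block_sides`, the four compass-side
labels, and the example block identifiers are all hypothetical, and
every block is assumed to have four sides.

```python
import random

SIDES = ["north", "east", "south", "west"]  # assumption: four-sided blocks


def draw_block_sides(blocks_by_stratum, rng):
    """Draw two blocks per rent stratum and two sides per drawn block.

    blocks_by_stratum: dict mapping a stratum label to a list of block ids
                       (each list needs at least two blocks).
    rng:               a random.Random instance, passed in for repeatability.
    Returns a list of (stratum, block, side) tuples, one per interview,
    so each stratum contributes a cluster of four interviews on two blocks.
    """
    plan = []
    for stratum in sorted(blocks_by_stratum):
        pair = rng.sample(blocks_by_stratum[stratum], 2)  # without replacement
        for block in pair:
            for side in rng.sample(SIDES, 2):
                plan.append((stratum, block, side))
    return plan


if __name__ == "__main__":
    strata = {"low rent": ["b1", "b2", "b3"], "high rent": ["b4", "b5", "b6"]}
    for row in draw_block_sides(strata, random.Random(0)):
        print(row)
```

The sex and age quotas that the interviewer must still respect within
each block side are left out of the sketch; they constrain whom he may
accept, not which block sides are drawn.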

All of the above considerations apply, of course, only to the regular
NORC national field staff. Some of the findings cited in this report
are based on special surveys conducted in particular areas and using a
staff of interviewers specially hired and trained for that job. Usually
these surveys employed some kind of probability sample. On such
surveys, the type of interviewer hired and the nature of the employment
and training are probably not much different from those involved in any
other [...]. Two or more office [...]
interviewers are dismissed, although one or two of those who showed
superior aptitude may be retained for the national staff provided their
community can be used as a regular sampling point.

Since hiring and training procedures, administrative and supervisory 
practices, rates of pay, and volume of work will inevitably differ from 
one research agency to another, the findings we have cited in this re- 
port which are based on the NORC staff must be weighed in conjunc- 
tion with the descriptive information provided in the foregoing. It is 
most probable, however, that the similarities in the interviewer's task, 
from one agency to another, are immeasurably greater than the differ- 
ences among his employers, and that, except in very unusual circum- 
stances, what has been found true of the NORC interviewers will 
equally hold true for other people performing the same job. 
Feldman, H. Hyman, and C. W. Hart, “A Field Study of Interviewer
Effects on the Quality of Survey Data,” Pub. Opin. Quart., XV (1951),
734-61. For a fuller discussion of the relation between performance on
the attitude-structure expectations test and the eliciting of [...],
see H. L. Smith and H. Hyman, “The Biasing Effect of Interviewer
Expectations on Survey Results,” Pub. Opin. Quart., XIV (1950), 491-506.
13. L. Guest, “A Study of Interviewer Competence,” Internat. Jour.
Opin. Attit. Res., I, No. 4 (1947), 17-30.
14. Examples of types of “bad” probes were: offering respondent
alternatives in the probe which should not be offered; asking a probe
which was irrelevant to the objective of coding that particular reply;
[suggesting] within the probe that the respondent’s opinion fell closer
to an end of the scale than respondent had previously indicated.
Examples of “good” probes were: requests for elaboration of answer,
repetition of the question, repetition of the alternative choices.
15. This canceling of gross effects is clearly demonstrated in the
study by Marks and Mauldin, op. cit.
16. On the question of individual differences in error tendencies, the
reader is also referred to Chapter VII.
17. American Jewish Committee, Department of Scientific Research
(unpublished manuscript).
18. These [...] interviewed an average of twelve uncoached respondents
each.
19. Frederick Mosteller, “The Reliability of Interviewers’ Ratings,” in
H. Cantril, Gauging Public Opinion (Princeton: Princeton University
Press, 1944), pp. 98-106.
20. This study was done in co-operation with the Bureau of Applied
Social Research, Columbia University.
21. For the opinion data, we cannot regard the total unreliability as
indicative of gross effect, since opinions may well change in time.
However, this fact should not jeopardize the analysis of systematic
effects over [...], since whatever real change has occurred should be a
constant in the comparison.
22. An alternative method would be to use the ratio of the number of
respondents either mentioning “courageous” on both surveys or on
neither survey to the total number of respondents. But the reliability
coefficient computed in this fashion is to some extent a function of
the proportion of respondents mentioning the attribute on each of the
surveys. As this proportion approaches 50 per cent, reliability
computed in this fashion tends to diminish.
23. All the major studies in the literature that were examined are
[listed] in Appendix C.
24. The two subsamples within a block were not random but
geographically systematic samples. However, low correlation of [...]
households, determined empirically, and some losses due to refusals and
not-at-homes, probably make the sampling variance [...] random samples,
so that the Chi-squared test should be a reasonably accurate test of
significance.
25. [...] we have used only the Chi-squareds cumulated over the ten
[...].
26. In this analysis, the Yates continuity correction has not been used
on the theory that the correction would overcompensate for the only
minor discontinuity in the distribution of the cumulated statistic. See
W. G. Cochran, “The [...] Correction for Continuity,” Iowa State
Journal of Science, [...] (1942).
27. Feldman, Hyman, and Hart, op. cit. [...] the field ratings and
open-ended questions have already been [discussed] in Chapter V.
28. [...] of these findings are discussed at length in Chapter V of
this monograph and in Feldman, Hyman, and Hart, op. cit.
29. H. Cantril, op. cit., chap. 8, Parts 1, 3, 4a, and 5; D. Cahalan,
V. Tamulonis, and H. Verner, “Interviewer Bias Involved in Certain
Types of Opinion Survey Questions,” Internat. Jour. Opin. Attit. Res.,
I (1947), 63-77.
30. [...] between clusters and between respondents [...] the observed
variance between interviewers [...] cluster and respondent variance.
31. See, for instance, D. Katz, “Do Interviewers Bias Poll Results?”
Pub. Opin. Quart., VI (1942); H. Cantril, op. cit., chap. 8, Parts 1,
3, 4a, 4b, 4c, 5; Cahalan, Tamulonis, and Verner, op. cit. Although
from the published material it is not clear exactly how the analysis
was made, Udow, “The Interviewer Effect in Public Opinion and Market
Research Surveys,” Archives of Psychology, No. 277 (1942), seems to
have been properly analyzed. In one study (H. Cantril, op. cit., chap.
8, Part 2), where interviewers interviewed non-interpenetrating samples
of respondents, only the respondents in matched pairs of interviewers,
interviewers with differing opinions but working in the same general
area, were used in the analysis. Here again, however, the analysis was
made on the assumption that the aggregates of respondents interviewed
by interviewers with given opinions were simple random samples. The
factors previously discussed might tend to make the sampling variances
derived from the assumption of simple random selection an
underestimate, while the fact that only matched interviewers were used
might lead to a positive correlation of the means of the response
distributions obtained by the different groups of interviewers and thus
tend to make the simple random sampling variances of the differences an
overestimate.
32. J. S. Stock and J. Hochstim, “A Method of Measuring Interviewer
Variability,” Pub. Opin. Quart., XV (1951), 322-34; Robert Ferber and
Hugh Wales, “Detection and Correction of Interviewer Bias,” Pub. Opin.
Quart., XVI (1952), 107-27.
33. See, for instance, Albert Blankenship, “The Effect of the
Interviewer upon the Response in a Public Opinion Poll,” Jour. Cons.
Psychol., IV (1940); Udow, op. cit.; Cantril, op. cit.; Cahalan,
Tamulonis, and Verner, op. cit.; F. Mosteller et al., The Pre-Election
Polls of 1948 (New York: SSRC, 1949), chap. 7.
34. J. Durbin and A. Stuart, “Differences in Response Rates of
Experienced and Inexperienced Interviewers,” Journal of the Royal
Statistical Society, Series A, 114 (1951). We are grateful to Messrs.
Durbin and Stuart and Professor M. G. Kendall for making these data
available to us in advance of publication.
35. P. C. Mahalanobis, “Recent Experiments in Statistical Sampling in
the Indian Statistical Institute,” Journal of the Royal Statistical
Society, CIX (1946).
36. This is probably somewhat of an overstatement of the prevalence of
statistically significant inter-interviewer variation in this [...],
since it appears that the significance tests were not made properly.
[Evidently,] for each question, the test was made on the difference
between the two interviewers who differed the most on that question.
Thus, the most extreme of the six possible differences was selected in
each case.
37. This finding is also confirmed by an excellent study executed by
Daniel Horvitz. He found great variation between different interviewers
(and also between different types of interviewers) in the [number] of
illnesses that were reported to them in a morbidity study. This was, of
course, essentially an open-end question situation where the results
were extremely dependent on the extent of probing by the interviewer.
The [...] design of the study makes the results conclusive. Daniel G.
Horvitz, “Sampling and Field Procedures of the Pittsburgh Morbidity
Survey,” Public Health Reports, LXVII (1952).
38. Sam Shapiro and John Eberhart, “Interviewer Differences in an
Intensive Interview Survey,” Internat. Jour. Opin. Attit. Res., I
(1947).
39. Stock and Hochstim, op. cit.
40. This conclusion was reached on the basis of data not presented in
the published article but kindly furnished us by the authors.
41. M. H. Hansen et al., “Response Errors in Surveys,” Journal of the
American Statistical Association, XLVI (1951).
42. Durbin and Stuart, op. cit.; N. S. Booker and S. T. David,
“Differences in Results Obtained by Experienced and Inexperienced
Interviewers,” Journal of the Royal Statistical Society, Series A, 115
(1952).
43. Durbin, op. cit.
44. The detailed discussion appeared in Feldman, Hyman, and Hart, op.
cit., pp. 749-50.
45. Booker and David, op. cit.
46. This finding gives further support to the demonstration in Chapter
V [that ...] questions, contrary to usual view, may be more susceptible
to [...].
47. [...] of the Negro interviewers discussed in Chapter IV.
48. Mahalanobis, op. cit.
49. For an extended discussion of different manifestations of [...]
bias, see Herbert Stember and Herbert Hyman, “How Interviewer Effects
Operate through Question Form,” Internat. Jour. Opin. Attit. Res., III
(1949).
50. Shapiro and Eberhart, op. cit., pp. 4, 5.
51. Ibid., pp. 16, 17.
52. Ferber and Wales report similar findings of an occasional
interviewer deviating markedly from the mass. Op. cit.

NOTES TO CHAPTER VII 


1. See Chap. VI for a description of each type of error and the manner 
of scoring errors. 

2. Assuming that the correlations were based on the fifteen interviewers 
rather than the thirty-three interviews. 

3. Paul B. Sheatsley, “An Analysis of Interviewer Characteristics and
Their Relationship to Performance—Part III,” Internat. Jour. Opin.
Attit. Res., V (1951), 193-97.

4. L. Guest and R. Nuckols, “A Laboratory Experiment in Recording in
Public Opinion Interviewing,” Internat. Jour. Opin. Attit. Res., IV
(1950), 346.

5. Duncan McRae, Jr., “Interviewer Performance in a Probability-Sampling
Survey” (unpublished document on file at the National Research
Council—Social Science Research Council sampling project).

6. Assuming that social skills and intelligence are uncorrelated, and
that they have about the same variance, and assuming that the social
skills and the kind of intelligence required in eliciting free-answers
and in competently handling the clerical aspects are the same. This
example is not intended as a realistic representation of the
constituents of the two abilities, but merely to show that the
possession of some common elements will result in a moderate degree of
correlation.

7. L. Guest, “A Study of Interviewer Competence,” Internat. Jour. Opin.
Attit. Res., I, No. 4 (1947), 26.

8. Guest and Nuckols, op. cit.

9. Dolores Anne Keyes, A Study of Interviewer Effect and Interviewer
Competence (Master’s thesis, University of Denver, 1949).

10. From data given in the A.J.C. report, we calculated the correlation
between total biasing errors and the total neutral errors to be .19.
The bias-neutral correlations for the various kinds of error would be
even smaller.

11. Guest and Nuckols, op. cit. 

12. This suggestion is supported also by the results of the Ferber
study described later in this chapter. See Robert Ferber and Hugh
Wales, “Detection and Correction of Interviewer Bias,” Pub. Opin.
Quart., XVI (1952), 106-27. Some of the interviewers obtained answers
significantly more unlike their own opinions, and this phenomenon is
termed by the authors “negative ideological bias.” It seems more
reasonable to explain such a phenomenon on the basis of a theory of
bias as random error.

13. Some evidence on the association between different types of bias
was presented in the article by Ferber and Wales. They compared the
bias in selection of respondents on background characteristics using
judgment sampling with the bias in responses obtained in the direction
of the interviewer’s own opinions for fourteen interviewers. Only a
moderate positive correlation of .42, not statistically significant,
was obtained, and owing to certain necessary crudities in the methods
of measuring the bias, this finding probably overstates the degree of
association. See ibid.

14. One minor bit of evidence on the relationship between expectational
sources of bias and the routine skill of recording answers to simple
pre-coded questions was available in the Smith-Hyman study. Interviewers
were classified into two groups on the basis of the number of errors they
made in coding answers to innocuous questions and compared with respect
to the errors they made on two questions testing “attitude-structure
expectations.” No significant relationship was demonstrated, suggesting that
such a simple mechanical skill is not correlated with expectational errors.
See H. Smith and H. Hyman, “The Biasing Effect of Interviewer Expectations
on Survey Results,” Pub. Opin. Quart., XIV (1950), 491-506.
15. Selden Menefee, “Recruiting an Opinion Field Staff,” Pub. Opin.
Quart., VIII (1944), 262-99.
16. Ruth Cavan, “Interviewing for Life History Material,” Amer. Jour.
Sociol., XXXV (1929-30), 100-115.
17. It is interesting to note that the indeterminacy in the [...]
great on a trait most akin to “social orientation.” In chapters II and [...] we
showed by a lengthy theoretical discussion how complex is the [...]
social orientation in the interview. This finding reveals quantitatively how
much confusion has attended this theoretical complexity.
18. Guest, op. cit. 
19. Guest and Nuckols, op. cit.
20. If one considers other aspects of interviewer performance besides
error-proneness, such as dependability, Guest and Nuckols' caution against
selecting the better educated takes on added significance. Sheatsley
demonstrates that turnover increases with formal education. See Sheatsley,
op. cit., p. 207.
21. Herbert Fisher, “Interviewer Bias in the Recording Operation,”
Internat. Jour. Opin. Attit. Res., IV (1950), 394-411.
22. Sheatsley, op. cit., Table 94. 
23. Keyes, op. cit.
24. These [...] were derived from the Allport-Vernon Study of Values
and are defined in the terms of the test.
25. Ronald Taft, Some Correlates of the Ability to Make Accurate Social
Judgments (unpublished Ph.D. dissertation, University of California,
Berkeley, 1950).
26. E. T. Horley, memorandum based on research conducted in Germany,
for Columbia University Bureau of Applied Social Research, Project
AFIRM, under the auspices of the Human Resources Research Institute,
Air University, January, 1952.
This specific finding is supported by Vernon who, after reviewing the
general literature on the appraisal of personality, states: “There is [...]
good evidence that in the long run better judges are slightly superior in [...]
introverted, asocial tendencies. This latter finding may [...] that the
extraverted, sociable person is less capable of standing back [...]
others impartially.” See P. E. Vernon, The Assessment of Psychological
Qualities by Verbal Methods, Medical Research Council, Industrial Health
Research Board, Report No. 83 (London: H. M. Stationery Office, 1938).
Quoted by permission of the Controller of Her Britannic Majesty’s
Stationery Office.
27. See Chap. V, especially Table 55. 


Notes to Pages 299-316 405 


28. Smith and Hyman, op. cit., 505-6.
29. H. Cantril, Gauging Public Opinion (Princeton: Princeton University
Press, 1947), pp. 147-49.
30. In the study by Fisher alluded to earlier in the chapter, he reported
a suggestive relationship between motor or clerical ability, as measured by
a simple recording test, and selective or biased recording in the direction of
the interviewer's ideology. However, in view of the statistical nonsignificance
of the Fisher finding, plus the Guest-Nuckols finding on the lack of
any correlation between clerical ability as revealed on the Minnesota test
and ideological bias, it would seem that ideological bias is not predicted
from simple motor or clerical ability.
31. Edwin Ghiselli, “The Validity of Commonly Employed Occupational
Tests,” University of California Publications in Psychology, V
(1949), 267.
32. E. S. Marks and W. P. Mauldin, “Response Errors in Census
Research,” Journal of the American Statistical Association, XLV (1950), 435.
Also see unpublished reports of the office of the Statistical Adviser to the
[...] Bureau of the Census, Department of Commerce.
33. Personal communication from Louis Moss, Director, British Social
Survey.
34. Page 99 ff.
35. The manuals that were examined included the following:
“Interviewing for NORC,” National Opinion Research Center; “Manual for Public
[...] Reporters,” American Institute of Public Opinion (Gallup); “The
[...] Guide,” Institute of Market Research; “Interviewers Handbook,”
Elmo Roper; and “A Manual for Interviewers,” Survey Research
Center, University of Michigan.
36. The detailed information about NORC interviewers cited in this
section is based on the previously cited articles by Sheatsley, and on the
[...] questionnaire administered to NORC’s current staff, described in
[...] and Appendix B.
37. H. Cantril, Gauging Public Opinion (Princeton: Princeton University
Press, 1947), pp. 147-49.
38. [...], op. cit.
39. Stanley L. Payne, “Interviewer Memory Faults,” Pub. Opin. Quart.,
XIII (1949), [...].
40. Bernard J. Covner, “Studies in Phonographic Recordings of Verbal
Material: IV. Written Reports of Interviewers,” Jour. App. Psychol.,
XXVIII (1944), 89-98.
41. Joseph C. Bevis, “Interviewing with Tape Recorders,” Pub. Opin.
Quart., XIII (1949), 629-34.
[...] too difficult for 12 per cent of the respondents, [...] for 23 per
cent, [...] per cent too difficult for almost [...]. Pub. Opin. Quart.,
XIII (1949), 314-19.
44. [...], op. cit., pp. 118, 286-88.


In general, it can be shown that the average distortion under Case 1
is B/4, where B is the total bias, while for Case 2 it is [...],
where p is the per cent of pro-interviewers, and hence the [...],
for an equal distribution of interviewers, will range from [...].
45. See P. C. Mahalanobis, “Recent Experiments in Statistical Sampling
in the Indian Statistical Institute,” Journal of the Royal Statistical
Society, CIX (1946), 325-70.
46. L. J. O'Rourke, “Measur[...],” [...] Psychological Measurement, [...].
As noted in chapter VI, “gross interviewer effect” is to be distinguished
from the total error which may occur in a [...]. Many [...]
may occur which do not result in a deviation [...]
prescribed wording of a question and still [...]
answer, or rather, the “true” response to the prescribed question, so that the
error does not become effective error.
49. J. Stevens Stock and Joseph R. Hochstim, “A Method of Measuring
Interviewer Variability,” Pub. Opin. Quart., XV (1951), 322-31.
50. Ibid.
where n_i is the number of respondents interviewed by the i-th interviewer.

Thus the approximation which assumes equal size of interviewer assign- 

ments gives the same result as the more exact formula to 6 decimal places in 

this case. 

52. P. C. Mahalanobis, op. cit.

53. This tendency for interviewer effect to locate within occasional
aberrant interviewers was also noted in the Denver and Cleveland findings
reported in chapter VI.

54. This statement does not seem completely justified, since we are only 
sure that error due to inter-interviewer variability was eliminated. Con- 
sistent bias over all interviewers may still have been present. 

55. Morris H. Hansen et al., “Response Errors in Surveys,” Journal of the 
American Statistical Association, XLVI (1951), 147-90. 

56. Ferber and Wales, op. cit.

57. Since the subsamples for interviewers were interpenetrating, the
expectation is that differences between the two distributions could be
accounted for by random sample fluctuations.

58. See “Labor Force Memorandum No. 5” of the Current Population
Survey, U.S. Bureau of the Census, November 8, 1950, or Estadistica,
March 1948, Vol. VI, No. 18.

59. Frederick Mosteller et al., The Pre-Election Polls of 1948 (New
York: SSRC, 1949), pp. 211-12.

60. See, for example, Philip M. Hauser, “Some Aspects of Methodological
Research in the 1950 Census,” Pub. Opin. Quart., XIV (1950), 5-13.

In addition to the use of the re-interview data as a basis for the
adjustment, the Census also will check the enumeration data against independent
records such as birth certificates and presumably derive additional empirical
adjustments.

61. For a detailed discussion of scaling methods, the reader is referred to
S. Stouffer et al., Measurement and Prediction (Princeton: Princeton
University Press, 1950).